ABSTRACT


Computational  Conceptual  Change:  An  Explanation-Based  Approach 

Scott  Friedman 

The  process  of  conceptual  change  -  whereby  new  knowledge  is  adopted  in  the  presence  of  prior, 
conflicting  knowledge  -  is  pervasive  in  human  cognitive  development,  and  contributes  to  our 
cognitive  flexibility.  At  present,  Artificial  Intelligence  systems  lack  the  flexibility  of  human 
conceptual  change.  This  is  due  in  part  to  challenges  in  knowledge  representation,  belief  revision, 
abduction,  and  induction.  In  addition,  there  are  disagreements  in  the  cognitive  science 
community  regarding  how  people  represent,  use,  and  revise  their  mental  models  of  the  world. 

This  work  describes  a  cognitive  model  of  conceptual  change.  The  claims  are  that  (1) 
qualitative  models  provide  a  consistent  computational  account  of  human  mental  models,  (2)  our 
psychologically plausible model of analogical generalization can learn these models from
examples,  and  (3)  conceptual  change  can  be  modeled  by  iteratively  constructing  explanations 
and  using  meta-level  reasoning  to  select  among  competing  explanations  and  revise  domain 
knowledge.  The  claims  are  supported  by  a  computational  model  of  conceptual  change,  an 
implementation  of  our  model  on  a  cognitive  architecture,  and  four  simulations. 

We  simulate  conceptual  change  in  the  domains  of  astronomy,  biology,  and  force  dynamics, 
where  examples  of  psychological  conceptual  change  have  been  empirically  documented.  Aside 
from  demonstrating  domain  generality,  the  simulations  provide  evidence  for  the  claims  of  the 
thesis. Our simulation that learns mental models from observation induces qualitative models of
movement, pushing, and blocking from observations and performs similarly to students in
problem-solving. Our simulation that creates and revises explanations about the changing of the
seasons shows that our system can assemble and transform mental models like students. Our
simulation  of  textbook  knowledge  acquisition  shows  that  our  system  can  incrementally  repair 
incorrect  knowledge  like  students  using  self-explanation.  Finally,  our  simulation  of  learning  and 
revising  a  force-like  concept  from  observations  shows  that  our  system  can  use  heuristics  and 
abduction to revise quantities in a manner similar to people. The performance of the simulations
provides evidence of (1) the accuracy of the cognitive model and (2) the adaptability of the
underlying cognitive systems that are capable of conceptual change.

Chapter 1: Introduction


“We  are  like  sailors  who  on  the  open  sea  must  reconstruct  their  ship  but  are  never  able  to  start 
afresh  from  the  bottom.  Where  a  beam  is  taken  away  a  new  one  must  at  once  be  put  there,  and 
for  this  the  rest  of  the  ship  is  used  as  support.  In  this  way,  by  using  the  old  beams  and  driftwood 
the  ship  can  be  shaped  entirely  anew,  but  only  by  gradual  reconstruction.” 

-  Otto  Neurath  (in  Quine,  1960) 


Neurath’s  analogy  between  rebuilding  a  ship  at  sea  and  lifelong  learning  communicates  several 
important  insights.  Like  sailors  reconstructing  their  ship,  we  can  repair  our  intuitive  knowledge 
to  become  more  scientifically  correct.  We  are  constrained  by  the  need  for  support:  as  beams  on 
the  ship  require  the  support  of  adjacent  beams,  so  does  our  understanding  of  observations  rely  on 
the  support  of  explanations.  Consequently,  the  transformations  of  the  ship  and  our  knowledge 
involve  the  revision  of  components  and  the  transition  of  support.  In  cognitive  science,  this 
transformation process is known as conceptual change. Following diSessa’s (2006)
characterization,  conceptual  change  is  the  process  of  building  new  ideas  in  the  context  of 
existing,  conflicting  ideas.  This  is  differentiated  from  skill  learning  (since  skills  involve 
procedural  knowledge)  and  from  the  tabula  rasa  acquisition  of  knowledge  (hence  the  emphasis 
on  “change”).  This  also  does  not  include  filling  gaps  in  incomplete  knowledge  (Chi,  2008)  or 
enriching  (i.e.,  adding  detail  to)  existing  knowledge  (Carey,  1991).  We  provide  examples  of 
conceptual change to help illustrate.

One well-documented example of conceptual change is the changing concept of force in
children (Ioannides and Vosniadou, 2002; diSessa et al., 2004). When students enter the
classroom,  they  have  intuitive  concepts  of  force  learned  from  experience  and  interaction.  One 
intuitive  theory  is  that  forces  act  on  objects  to  keep  them  translating  or  rotating,  and  then 
gradually  die  off  -  similar  to  the  medieval  concept  of  impetus  (McCloskey,  1983).  While 
scientifically  incorrect,  this  concept  of  force  is  still  productive  for  understanding  and 
manipulating  the  world,  which  is  probably  why  it  is  so  resilient  to  change.  Through  education, 
students  revise  these  intuitive  concepts  incrementally,  although  often  unsuccessfully.  Even  after 
learning  scientifically  correct  quantitative  aspects  of  force  such  as  F  =  m*a,  students  often 
operate  with  the  same  incorrect  qualitative  theories  when  labeling  forces  and  drawing  projectile 
trajectories  (Clement,  1985;  Hestenes  et  al.,  1992). 

Revising  the  concept  of  force  involves  revising  the  specification  (diSessa  et  al.,  2004)  of  the 
category.  The  specification  includes  the  conditions  under  which  a  force  exists,  the  consequences 
of  a  force’s  existence,  how  forces  are  combined,  and  the  relationship  of  a  force  to  other 
quantities  (e.g.,  mass,  velocity,  acceleration).1  For  example,  there  is  evidence  that  novices 
frequently  conceive  of  force  as  a  substance-like  quantity  (Reiner  et  al.,  2000)  that  can  be 
acquired,  possessed,  transferred,  and  subsequently  lost  by  physical  objects.  Changing  force  from 
this  intuitive,  substance-like  specification  to  a  Newtonian  specification  requires  changing  the 
conditions  and  consequences  of  a  force’s  existence,  the  model  of  how  forces  combine,  and  the 
relationship  of  force  to  acceleration  and  mass.  We  refer  to  this  type  of  conceptual  change  as 
category revision, and we discuss this further in Chapter 8. Another example of category revision is
differentiating heat and temperature. This has been characterized in the history of science (Wiser
and Carey, 1983) as well as within individual students (Wiser and Amin, 2001): the words “heat”
and “temperature” are at first used interchangeably, and then this over-general concept is revised
into two specific concepts, resulting in an intensive concept of temperature and an extensive
concept of heat.

1 Reif (1985) refers to this as the ancillary knowledge of a quantity.

There  is  considerable  disagreement  among  cognitive  scientists  on  how  this  type  of 
conceptual  change  occurs:  do  categories  actually  get  directly  shifted  (e.g.,  Chi,  2008)?  Are  they 
added  as  additional  categories  that  coexist  alongside  the  prior  category  (e.g.,  diSessa  and  Sherin, 
1998)?  Do  the  new  and  the  old  categories  coexist,  but  in  different  conceptual  systems  (e.g., 
Carey, 2009)?2 If new and old categories coexist somehow, people seem to understand that they
are  mutually  incoherent,  perhaps  due  to  belief-level  refutation  (Chi,  2008)  or  incompatibility 
between  the  vocabularies  (Carey,  2009).  Regardless  of  whether  and  how  information  coexists, 
any  cognitive  model  of  conceptual  change  must  explain  how  people  come  to  use  a  new 
conceptual  vocabulary  (e.g.,  Newtonian  force)  in  place  of  an  old  vocabulary  (e.g.,  impetus-like 
force). 

The  second  type  of  conceptual  change  we  simulate  is  mental  model  transformation  (Chi, 
2008). This involves revising the causal knowledge about physical systems held in long-term
memory, often referred to as mental models (Gentner & Stevens, 1983). Suppose a
student has the common misconception that blood flows in a single loop in the human circulatory
system: from the heart to the rest of the body, and then back again, to be oxygenated by the heart
(Chi et al., 1994a). Revising this mental model of the circulatory system to include a second
loop - from the heart to the lungs for oxygenation, and then back - involves a transformation of
this knowledge. This is not merely filling a gap in incomplete knowledge, since the old and new
models of the circulatory system make conflicting predictions. This type of conceptual change
has also been characterized in the domains of biology (Carey, 1985; Keil, 1994; Inagaki &
Hatano, 2002), the shape of the earth (Vosniadou and Brewer, 1992), the changing of the seasons
(Atwood & Atwood, 1996; Sherin et al., 2012), and others. Both types of conceptual change -
category revision and mental model transformation - are the ubiquitous results of our attempts to
reconcile new observations and instructions with our existing belief system.

2 Chapter 2 discusses this and other points of disagreement and divergence in theories of conceptual change.

Conceptual  change  is  pervasive  in  our  cognitive  development  and  education,  and  contributes 
to  the  flexibility  of  human  thought  over  time.  The  same  is  not  true  for  Artificial  Intelligence  (AI) 
systems;  at  present,  AI  systems  are  brittle  (McCarthy,  2007)  in  that  they  often  malfunction  when 
faced  with  new  types  of  tasks  and  unexpected  observations.  Many  researchers  in  the  field 
believe  this  can  be  fixed  by  making  the  central  cognitive  architecture  of  AI  systems  more 
flexible and adaptable (e.g., Nilsson, 2005; Cassimatis, 2006). We believe that conceptual
change  is  an  important  consideration  for  building  more  adaptable  AI. 

Modeling  conceptual  change  will  have  a  number  of  practical  applications.  For  example, 
scientific  discovery  systems  would  benefit  from  having  more  flexible  representations,  whether 
using  machines  as  collaborators  (e.g.,  Langley,  2000),  as  automated  scientists  (e.g.,  Ross,  2009; 
Langley,  1983),  or  as  mathematicians  (e.g.,  Lenat  &  Brown,  1984).  Intelligent  tutoring  systems 
will  benefit  similarly  -  if  a  tutoring  system  can  model  a  student’s  intuitive  knowledge3  and 
model  the  process  of  conceptual  change,  it  can  help  guide  the  student  through  difficult  learning. 
Finally,  conceptual  change  will  affect  how  we  interact  with  intelligent  agents.  As  eloquently  put 
by  Lombrozo  (2006),  explanations  are  the  currency  with  which  we  exchange  beliefs.  Conceptual 
change  -  and  explanation  construction,  which  is  part  of  our  conceptual  change  model  -  will  help 
AI  systems  exchange  the  same  explanatory  “currency”  as  people.  Specifically,  this  will  help  an 
AI system (1) construct explanations that are understandable by humans, (2) represent
explanations provided by humans and other resources (e.g., textbooks), and (3) revise beliefs and
explanations as humans do, for more intuitive interaction.

3 See Anderson & Gluck (2001) for how one type of tutoring system models students’ procedural mathematics
knowledge. Procedural knowledge is not included in our model of conceptual change.

Given  these  benefits  to  human-level  AI  research  and  applied  AI  systems,  why  not  provide 
these  systems  with  a  computational  model  of  human  conceptual  change?  Unfortunately,  such  a 
computational  model  does  not  yet  exist.  I  believe  this  is  due  to  two  general  obstacles:  (1)  the 
complexity  of  human  conceptual  change  and  (2)  disagreements  in  the  cognitive  science 
community about how conceptual change occurs. Human conceptual change is complex in that it
involves  constructing  explanations  (Chi  et  al.,  1994a),  revising  beliefs  and  explanations  (Sherin 
et  al.,  2012;  Vosniadou  &  Brewer,  1992),  analogy  (Gentner  et  al.,  1997;  Brown  &  Clement, 
1989), and decision-making about new information (Chinn & Brewer, 1998). The major points
of  contention  in  the  cognitive  science  literature  involve  the  representation  of  conceptual 
knowledge  (Forbus  &  Gentner,  1997;  Nersessian,  2007),  the  organization  of  conceptual 
knowledge  (diSessa  et  al.,  2004;  Ioannides  &  Vosniadou,  2002),  and  the  mechanisms  of  change 
(Ohlsson,  2009;  Chi  and  Brem,  2009;  diSessa  and  Sherin,  1998).  Fortunately,  advances  in 
cognitive  science,  both  theoretical  and  empirical,  have  reached  the  point  where  modeling  this 
complex  phenomenon  is  now  more  feasible. 

This  dissertation  presents  and  evaluates  an  integrated  model  of  human  conceptual  change. 
The  evaluation  of  our  computational  model  and  criteria  for  success  rely  upon  its  accuracy  in 
explaining and predicting human learning and problem-solving. In each simulation, the system
starts with knowledge similar to that of people, is given learning stimuli similar to those given
to people, and is evaluated using problem-solving tasks similar to those given to people. By
comparing the system’s problem-solving performance with that of students described in the
literature, we can determine whether the system can learn along a humanlike trajectory of
misconceptions and scientific theories. We simulate different students by varying the system’s starting knowledge
and  altering  simulation  parameters.  Success  is  determined  by  the  range  of  student  trajectories 
our  system  can  match  using  this  strategy  across  simulation  trials. 

Our  cognitive  model  makes  a  number  of  psychological  assumptions  concerning  human 
perception,  knowledge  representation,  reasoning,  and  learning.  We  hold  these  core  assumptions 
constant  across  all  four  simulations,  and  describe  them  later  in  this  chapter.  In  addition,  each 
simulation  makes  task-specific  assumptions.  Some  of  these  core  and  task-specific  assumptions 
are  needed  to  deal  with  current  limitations  of  the  conceptual  model:  for  example,  in  some  cases 
the  model  retains  more  information  about  a  learning  experience  than  is  likely  for  humans.  These 
interim  assumptions  provide  explicit  opportunities  for  extending  this  research.  Half  of  the 
simulations  use  automatically  generated  training  and  testing  data,  and  half  use  hand-coded  data 
based on evidence from the literature. Both types of data make assumptions about
psychological  knowledge  encoding  that  are  discussed  below. 

This  dissertation  is  structured  as  follows.  The  rest  of  Chapter  1  is  focused  on  the  problem  of 
conceptual  change,  the  central  theoretical  claims  of  this  dissertation,  and  the  high-level 
psychological  assumptions  of  this  cognitive  model.  Chapter  2  discusses  other  theories  of  human 
conceptual  change  in  the  cognitive  psychology  literature.  Chapter  3  reviews  the  AI  theories  and 
techniques  used  in  our  computational  model.  Chapter  4  presents  the  model  of  conceptual  change 
and  defines  the  terminology  and  algorithms  used  in  the  simulations.  The  model  of  conceptual 
change  is  a  novel  contribution  of  this  dissertation,  but  it  builds  upon  the  existing  AI  technologies 
described in Chapter 3. Chapters 5-8 discuss four simulations: learning intuitive mental models
(Chapter 5); mental model transformation as explanation revision (Chapter 6); mental model
transformation from a textbook passage (Chapter 7); and category revision for changing a
concept of force (Chapter 8). Chapter 9 revisits the claims, outlines some related work, and
explores  some  objections,  limitations,  and  opportunities  for  future  work.  The  appendices  contain 
additional algorithms and material for replication of the work described here.

Figure 1. Correspondences between psychological and artificial entities in this dissertation.

1.1  Claims 

In  this  section  we  state  the  three  principal  claims  of  this  dissertation  and  outline  how  these  claims 
are  supported.  In  discussing  our  claims  and  presenting  our  cognitive  model,  it  is  important  to 
clarify  when  we  are  referring  to  people  and  when  we  are  referring  to  AI  systems.  We  include 
Figure  1  to  prevent  ambiguity.  For  the  remainder  of  this  dissertation,  “human”  and 
“psychological”  will  refer  to  humans,  “AI”  and  “artificial”  will  refer  to  the  computational  model, 
and  “agent”  will  refer  to  both.  The  first  claim  concerns  how  to  represent  human  mental  models 
in an AI system:

Claim 1: Compositional qualitative models provide a consistent computational account of
human mental models.

By  “consistent  computational  account”  we  mean  that  compositional  qualitative  models  can 
consistently  explain  how  people  solve  problems  and  construct  explanations  in  multiple  domains. 
Since  Claim  1  is  a  knowledge  representation  claim,  it  can  be  tested  by  (1)  observing  how  people 
construct  explanations  and  solve  problems  with  their  mental  models  and  (2)  using  compositional 
qualitative  models  to  construct  the  same  explanations  and  solve  the  same  problems.  Claim  1  is 
not  a  new  idea  -  in  fact,  human  mental  models  were  one  of  the  initial  motivations  for  qualitative 
modeling  in  AI  (Forbus  &  Gentner,  1997);  however,  we  include  this  claim  in  the  dissertation 
because  we  offer  considerable  novel  evidence  to  support  it  (i.e.,  the  simulations  in  Chapters  5-8) 
and  the  other  claims  rely  upon  it.  We  provide  an  overview  of  compositional  qualitative  models 
in  AI  in  Chapter  3. 

This dissertation includes a simulation of how people learn mental models from a sequence
of  observations,  described  in  Chapter  5.  With  respect  to  Claim  1,  this  simulation  uses  qualitative 
models  to  simulate  human  mental  models,  but  it  also  relies  on  an  analogical  learning  algorithm 
called  SAGE.  SAGE  is  a  psychologically  plausible  model  of  analogical  generalization  -  that  is, 
it abstracts the common relational structure across multiple cases. We discuss SAGE further in
Chapter 3; it is a component of the next claim.

Claim 2: Analogical generalization, as modeled by SAGE, is capable of inducing qualitative
models that satisfy Claim 1.

Claim 2 is a novel claim, since AI systems have not previously induced qualitative models by
these means. Claim 2 is supported by the simulation described in Chapter 5.

The  third  claim  involves  modeling  the  two  types  of  conceptual  change  described  above: 

Claim  3:  Human  mental  model  transformation  and  category  revision  can  both  be  modeled 
by  iteratively  (1)  constructing  explanations  and  (2)  using  meta-level  reasoning  to  select 
among  competing  explanations  and  revise  domain  knowledge. 

Claim  3  relies  on  the  terms  explanation,  meta-level,  and  domain  knowledge.  We  define  these 
terms  here  with  a  simple  example  to  clarify  this  claim.  We  define  these  same  terms  more 
precisely  in  Chapters  3  and  4.  We  intentionally  avoid  the  word  “theory”  when  referring  to 
human  knowledge,  since  this  word  has  been  used  to  describe  (1)  systematic  science  knowledge, 
(2)  “intuitive  theories”  of  novices,  and  (3)  “domain  theories”  of  model-based  reasoning  systems. 
We  can  thereby  avoid  conflating  these  distinct  concepts. 

We  test  Claim  3  by  building  a  computational  model  and  evaluating  it  according  to  the 
criteria  put  forth  by  Cassimatis,  Bello,  and  Langley  (2008):  (1)  the  model’s  ability  to  reason  and 
learn as people do; (2) the breadth of situations in which it can do so; and (3) the parsimony of
mechanisms  it  posits  (i.e.,  using  the  same  mechanisms  across  domains  and  tasks). 

In this dissertation, domain knowledge consists of one or more of the following:
propositional beliefs (i.e., statements that evaluate to true or false), quantities (e.g., a
specification of “force”), and mental model parts (see Figure 1 for modeling vocabulary and
Figure 2 for examples).4 Consider the two sets of domain knowledge Da and Db about the human
circulatory system in Figure 2, which are simplified accounts of student knowledge (Chi et al.,
1994a).


                       | Da: single loop                         | Db: double loop
-----------------------+-----------------------------------------+-----------------------------------------
Propositional beliefs  | Blood is a type of liquid               | Blood is a type of liquid
                       | The heart contains blood                | The heart contains blood
                       | Arteries channel blood from the heart   | Arteries channel blood from the heart
                       | Veins channel blood to the heart        | Veins channel blood to the heart
                       | The heart oxygenates blood              | The lungs oxygenate blood
                       | All blood leaving the heart flows       | Some blood leaving the heart flows
                       |   directly to the rest of the body      |   directly to the rest of the body
                       | All blood leaving the rest of the body  | All blood leaving the rest of the body
                       |   flows directly to the heart           |   flows directly to the heart
                       |                                         | Some blood leaving the heart flows
                       |                                         |   directly to the lungs
                       |                                         | All blood leaving the lungs flows
                       |                                         |   directly to the heart
-----------------------+-----------------------------------------+-----------------------------------------
Mental model parts     | Fluid-flow                              | Fluid-flow
                       | Infusing-compound-into-liquid           | Infusing-compound-into-liquid
                       | Consuming-compound-from-liquid          | Consuming-compound-from-liquid
-----------------------+-----------------------------------------+-----------------------------------------
Quantity specs.        | none                                    | none

Figure 2. Two intuitive accounts of the human circulatory system. They share propositional beliefs and
mental model parts, but some propositional beliefs in Da are inconsistent with those in Db.

Account Da contains beliefs that blood flows from the heart to the rest of the body and back
- and nowhere else. Account Db contains beliefs that blood also flows from the heart to the
lungs and back.5 Both accounts share some propositional beliefs and mental model parts, but
some propositional beliefs of Da are inconsistent with those of Db.

4 We assume - as discussed later in this chapter - that mental models are divisible into reusable components. We
simulate these using compositional model fragments, each of which represents a process or conceptual entity (see
Chapter 3 for model fragment overview).

An  explanation  is  a  set  of  domain  knowledge  that  is  joined  by  logical  justifications6  to 
explain  some  phenomenon  or  event  m,  where  m  is  represented  by  one  or  more  propositional 
beliefs  in  domain  knowledge.  Domain  knowledge,  e.g.,  Da,  Db,  or  any  subset  thereof,  may  be  in 
zero  or  more  explanations  of  phenomena. 

Suppose  an  agent  has  explained  the  phenomenon  m  =  “the  body  receives  oxygen  from  the 
blood”  with  an  explanation  xa  that  uses  the  knowledge  in  Da.  Suppose  also  that  the  agent  has 
decided that xa is presently the best account it has of how m happens. Using terminology from
abductive  reasoning,  we  call  xa  the  best  explanation  (Peirce,  1958)  or  preferred  explanation  for 
m,  since  other  inferior  explanations  may  exist. 

Now suppose the agent reads several sentences of a textbook passage and has acquired the
knowledge Db, while still entertaining its previous account Da. When the agent uses the new
knowledge in Db to explain m, a new explanation xb is created for m, and we say that xa and xb
now compete to explain m. Explanations such as xa and xb are persistent structures, and are used
to compartmentalize and contextualize information. This means that the new information Db
does not replace the existing information Da; rather, the inconsistent beliefs in Da and Db coexist
simultaneously. If the agent compares competing explanations xa and xb and determines that the
new explanation xb is better than the presently preferred explanation xa (e.g., because it contains
new information from a trusted source), xb will replace xa as the agent’s preferred explanation for
m. This exemplifies part of Claim 3: that the agent constructs explanations and evaluates
preferences as a mechanism of change.
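The competition between xa and xb can be sketched in a few lines of Python. This is a toy under stated assumptions rather than the model presented in later chapters: the Explanation and Agent classes, the trusted_source flag, and the preference rule are all hypothetical stand-ins for the meta-level reasoning described above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Explanation:
    name: str
    phenomenon: str
    beliefs: frozenset
    trusted_source: bool = False  # hypothetical stand-in for meta-level evidence


@dataclass
class Agent:
    explanations: dict = field(default_factory=dict)  # phenomenon -> all explanations
    preferred: dict = field(default_factory=dict)     # phenomenon -> preferred explanation

    @staticmethod
    def prefers(new, old):
        # Toy preference rule (an assumption): favor explanations from a trusted source.
        return new.trusted_source and not old.trusted_source

    def add_explanation(self, x):
        # New explanations are stored alongside old ones; nothing is overwritten.
        self.explanations.setdefault(x.phenomenon, []).append(x)
        current = self.preferred.get(x.phenomenon)
        if current is None or self.prefers(x, current):
            self.preferred[x.phenomenon] = x


m = "the body receives oxygen from the blood"
xa = Explanation("xa", m, frozenset({"The heart oxygenates blood"}))
xb = Explanation("xb", m, frozenset({"The lungs oxygenate blood"}), trusted_source=True)

agent = Agent()
agent.add_explanation(xa)  # xa becomes preferred by default
agent.add_explanation(xb)  # xb competes with xa and wins under the toy rule
```

Both explanations remain stored after the second call, which mirrors the point that the inconsistent beliefs coexist; only the preference changes.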

5 Neither Da nor Db is a complete, correct account of the human circulatory system, but both represent mental
models  of  the  circulatory  system  used  by  middle-school  students  (Chi  et  al.,  1994a). 

6  We  define  justifications  in  Chapter  3. 




The decision to replace xa with xb as the preferred explanation for m has broad implications for
the agent. For instance, if the agent must describe the mechanism of m on an exam, it can access
its preferred explanation xb for m to construct a solution. Alternatively, suppose the agent must
explain a novel phenomenon m’ (e.g., the effect of a collapsed lung on the brain’s oxygen). To
do this, the agent uses similarity-based retrieval (Forbus et al., 1995) to retrieve the relevant
phenomenon m, accesses the best explanation xb, and then uses the domain knowledge Db within
xb to explain m’. If domain knowledge Db is used within the preferred explanation xc for the new
phenomenon m’, then the set Db of domain knowledge now supports the preferred explanations
of both m and m’, and the set Da supports neither (though it shares some of the knowledge of Db).
Via this system of preferential retrieval and reuse of explanations, beliefs are used and
propagated according to whether they participate in preferred explanations. When a belief is no
longer a member of a preferred explanation (e.g., the belief “all blood leaving the heart flows
directly to the body” in Da), it is effectively inert. This constitutes a mental model
transformation. Chapters 6 and 7 describe simulations of mental model transformation via
explanation revision.
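The notion of an inert belief can be illustrated with a short sketch, assuming beliefs are abbreviated strings and that each preferred explanation is reduced to the set of beliefs it uses. The function names and the abbreviated belief strings are hypothetical, and similarity-based retrieval is omitted entirely.

```python
def active_beliefs(preferred):
    """Union of the beliefs used by at least one preferred explanation."""
    active = set()
    for beliefs in preferred.values():
        active |= beliefs
    return active


def inert_beliefs(all_beliefs, preferred):
    """Beliefs that no preferred explanation currently uses."""
    return set(all_beliefs) - active_beliefs(preferred)


# Abbreviated, hypothetical fragments of Da and Db.
D_A = {"heart oxygenates blood", "all blood leaving heart flows to body",
       "veins channel blood to heart"}
D_B = {"lungs oxygenate blood", "some blood leaving heart flows to lungs",
       "veins channel blood to heart"}

# After revision, the preferred explanations of m and m' both draw on Db.
preferred = {
    "m": {"lungs oxygenate blood", "veins channel blood to heart"},
    "m_prime": {"lungs oxygenate blood", "some blood leaving heart flows to lungs"},
}

inert = inert_beliefs(D_A | D_B, preferred)
```

Here the two beliefs unique to the single-loop account end up inert, while the belief shared by both accounts stays active because a preferred explanation still uses it.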

Claim  3  also  states  that  category  revision  occurs  by  the  same  mechanism  of  change. 
Consider  a  different  example:  an  agent  believes  that  (1)  all  objects  have  a  quantity  q  which  has  a 
spatial  directional  component  (e.g.,  an  object  can  have  leftward  q,  downward  q,  etc.),  (2)  an 
object  moves  if  and  only  if  its  q  is  in  the  direction  of  motion,  and  (3)  an  object  stops  moving  in  a 
direction  if  its  q  loses  that  directional  component.  Consequently,  q  is  a  conflation  of  weight  and 
momentum,  similar  to  some  concepts  of  force  found  in  the  literature  (Ioannides  &  Vosniadou, 
2002).  Suppose  the  agent  watches  a  foot  strike  a  large  ball  and  then  immediately  observes  the 
foot strike a smaller ball, which moves a greater distance. The agent compares the two events,
and determines that the present specification of q cannot explain the discrepancy in the distances
the balls travel. To resolve this explanation failure, the agent considers that q might be an
acquired quantity such that one object can transfer some amount of q to another by touch or
collision (rather than shifting the direction of existing q, as previously believed), and that the
transfer rate of q is inversely proportional to the size of the recipient. This results in a new quantity
specification qa which is a revision of the previous quantity specification q.7

The  agent  can  use  its  new  quantity  specification  qa  to  explain  why  the  large  and  small  balls 
travel  different  distances.  As  in  the  mental  model  transformation  example,  the  agent  formulates 
new  explanations  with  qa  rather  than  q,  and  embeds  qa  into  preferred  explanations  of  new 
phenomena.  Further,  the  agent  can  find  previous  phenomena  explained  with  q  and  explain  them 
using  qa.  This  process  of  retrospective  explanation  embeds  qa  in  additional  preferred 
explanations and promotes conceptual change. As in the circulatory system example, the
previously existing knowledge becomes less likely to be retrieved and reused, and might
eventually become inert.
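The ball example can be turned into a deliberately crude sketch. Everything here is an assumption made for illustration: the QuantitySpec class, the transferable flag, the strike / size rule standing in for a transfer rate inversely proportional to recipient size, and the made-up sizes and distances. It is not the heuristic revision procedure of Chapter 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantitySpec:
    """Hypothetical, highly simplified stand-in for a quantity specification."""
    name: str
    transferable: bool  # can q be passed from one object to another on contact?

    def predicted_distance(self, strike: float, size: float) -> float:
        # Toy assumption: under the revised spec, the amount of q received is
        # inversely proportional to the recipient's size; under the old spec,
        # size plays no role.
        return strike / size if self.transferable else strike


def explains(spec, strike, large, small):
    """large and small are (size, observed_distance) pairs for the two balls."""
    pred_large = spec.predicted_distance(strike, large[0])
    pred_small = spec.predicted_distance(strike, small[0])
    # The spec accounts for the observation if it predicts the same ordering.
    return (pred_small > pred_large) == (small[1] > large[1])


q = QuantitySpec("q", transferable=False)
qa = QuantitySpec("qa", transferable=True)

large_ball, small_ball = (2.0, 3.0), (1.0, 6.0)  # made-up sizes and distances

# Retrospective explanation: re-explain a phenomenon previously explained with q.
preferred = {"foot strikes ball": q}
if not explains(q, 1.0, large_ball, small_ball) and explains(qa, 1.0, large_ball, small_ball):
    preferred["foot strikes ball"] = qa
```

Under these assumptions the old specification predicts equal distances and so fails to account for the discrepancy, while the revised one predicts that the smaller ball travels farther and replaces q in the preferred explanation.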

In our model, category revision and mental model transformation are different types of
conceptual change because they involve different types of changes to conceptual knowledge:
category revision revises an element (e.g., q) within domain knowledge, and mental model
transformation recombines existing elements of domain knowledge (e.g., mental model parts and
propositional beliefs) into different aggregates. Importantly, both of these changes are
propagated throughout the knowledge base using the same explanation-based process. So while
these two types of conceptual change result in different changes to memory, they share a
common propagation mechanism and underlying memory structure. This completes our
discussion of the third claim.

7 Chapter 8 shows how heuristics can be used to revise quantities upon encountering anomalies.

To summarize, constructing and evaluating explanations is the primary mechanism of
conceptual  change  in  our  cognitive  model.  We  have  very  abstractly  sketched  how  this  occurs, 
but  this  does  not  qualify  as  a  theory  or  model  of  conceptual  change  in  itself.  In  later  chapters,  we 
describe  the  representations  and  algorithms  -  including  models  of  explanation  construction  and 
explanation  evaluation  -  that  underlie  this  specification.  As  abstract  as  it  is,  our  above  sketch  of 
the  two  types  of  conceptual  change  does  make  a  number  of  high-level  psychological  assumptions 
that  are  worth  addressing  before  we  discuss  the  details. 

1.2  Psychological  assumptions  of  our  model  of  conceptual  change 

We summarize our assumptions in Figure 3. Some of these assumptions are supported (s) by the
literature; these serve as psychological constraints for cognitive modeling. Assumptions that are
unsupported (u) by the literature serve as psychological predictions of this cognitive model that
might be confirmed by later psychological experimentation. Finally, assumptions that are
inconsistent (i) with the literature are limitations and opportunities for future improvement. We
discuss each of these assumptions next.

1. Human experts and novices can mentally simulate physical phenomena
qualitatively. (s)

2. When a person uses a mental model to reason about the world, the object(s)
described by the mental model generally correspond to real-world objects. (s)

3. People represent causal influences between quantities in their intuitive
knowledge about the world. (s)

4. Regardless of how they are organized within theories and explanations, human
mental models can be represented as reusable parts. (s)

5. People store mental models in long-term memory. (s)

6. People can learn and reason with propositional beliefs. (s)

7. People can evaluate competing explanations for a single phenomenon. (s)

8. People can believe two inconsistent beliefs simultaneously when those beliefs
are used to explain different phenomena. (s)

9. After explaining a phenomenon, people generally retain the best explanation
for the phenomenon in long-term memory, but may not discard other
explanations. (u)

10. When explaining a novel phenomenon, people often retrieve a similar,
previously-understood phenomenon to aid in explanation. (s)

11. People use analogy to generalize the common structure of observations. (s)

12. People can revise the ontological properties of a quantity concept. (s)

13. People do not immediately replace concepts through conceptual change;
throughout the process, people have access to both the old and new knowledge
(e.g., quantities and mental models). (s)

14. People can change how they explain a phenomenon. (s)

15. The cognitive processing required to transition away from a misconception is
qualitatively proportional to how pervasively the misconception was previously
used in explanations. (u)

16. By (15), some misconceptions are more resilient to change than others. (s)

Figure 3. High-level psychological assumptions of our cognitive model, organized by
type. Each is labeled where supported by (s) or unsupported by (u) the literature.

1.2.1 Assumptions about knowledge representation

Toulmin (1972) argued that the term “concept” is pervasively used and ill-defined in the
literature, and this complaint is still warranted today. As Figure 1 illustrates, we have a very
specific  representation  of  conceptual  knowledge,  using  existing  knowledge  representation 
formalisms  in  AI.  Further,  our  claim  that  human  mental  models  can  be  simulated  with 
compositional  model  fragments  has  been  argued  previously  (e.g.,  Forbus  &  Gentner,  1997). 

Still,  we  review  each  of  the  related  assumptions,  since  the  mental  model  literature  has  not 
reached  consensus  on  knowledge  representation. 

In support of assumption #1: use of qualitative reasoning, there is evidence that novices and
experts  alike  often  reason  with  incomplete  and  imprecise  qualitative  knowledge,  especially  in 
situations  of  informational  uncertainty  (Trickett  &  Trafton,  2007).  This  supports  our  choice  of 
using  compositional  qualitative  models  to  simulate  human  mental  models.  We  describe 
qualitative  reasoning  in  more  detail  in  Chapter  3. 

The term “mental models” (Gentner & Stevens, 1983; Gentner, 2002) has been widely used
to  describe  representations  of  domains  or  situations  that  support  everyday  explanation  and 
prediction.  Nersessian  (2007)  provides  generally-accepted  criteria  for  psychological  mental 
model-based  reasoning: 

•  It  involves  the  construction  or  retrieval  of  a  mental  model. 

•  Inferences  are  derived  through  manipulation  of  the  mental  model. 

Vosniadou & Brewer (1994) note additional characteristics of mental models:

• The observable and unobservable objects and states of the world that a mental model
represents are often analogs of real-world objects. (Supports assumption #2: entities and
states in mental models have real-world correspondences.)

•  Mental  models  provide  explanations  of  physical  phenomena. 

•  Many  mental  models  may  be  manipulated  mentally  or  “run  in  the  mind’s  eye”  to  make 
predictions  about  the  outcomes  of  causal  states  in  the  world. 

One  representation  distinction  noted  by  Markman  and  Gentner  (2001)  is  between  logical 
mental models and causal mental models. In the logical mental model account (e.g.,
Johnson-Laird, 1983), mental models are logical constructs in working memory. In this view, mental
models  are  constructed  on-the-spot,  involving  only  knowledge  in  working  memory  about  the 
local  problem-at-hand.  This  approach  has  been  criticized  for  failing  to  simulate  human 
reasoning  that  is  captured  by  propositional  reasoning  (Rips,  1986).  This  definition  of  mental 
models  is  inconsistent  with  assumption  #5:  mental  models  in  LTM. 

In the causal mental model account (e.g., Gentner & Stevens, 1983), the entities and
quantities of a mental model correspond to observable and unobservable entities and quantities in
a causal system (supporting assumption #2: entities and states in mental models have real-world
correspondences). Further, causal mental models draw on long-term domain knowledge
(supporting assumption #5: mental models in LTM). In this dissertation, we use the term “mental
model” to refer to this causal account of mental models.

In support of assumption #3: representing quantity influences, there is evidence that even
infants have knowledge about the relationship between quantities. For example, 6.5-month-old
infants look reliably longer - indicating a violation of expectation - when a small object A strikes
a second object B and causes B to roll farther than when a large object C hits the same object B.
This suggests that infants understand an indirect influence between quantities: the distance
something travels is qualitatively proportional to the size of the object that strikes it (Baillargeon,
1998). It is safe to assume that humans have a tendency to represent influences between
quantities, even prior to formal instruction, but not prior to experience. We describe direct and
indirect influences in depth in Chapter 3.
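The infant expectation described above can be illustrated with a minimal sketch of a qualitative proportionality. The `QProp` type and both function names are hypothetical and introduced only for this illustration; the representation actually used by the model is the one described in Chapter 3. Changes are encoded as signs: +1 (increase), 0 (no change), -1 (decrease).

```python
from typing import NamedTuple


class QProp(NamedTuple):
    dependent: str  # quantity being influenced
    cause: str      # quantity doing the influencing
    sign: int       # +1: qualitatively proportional, -1: inversely proportional


def predict_change(qprop, cause_change):
    """Sign of the expected change in the dependent quantity, all else equal."""
    return qprop.sign * cause_change


def violates_expectation(qprop, cause_change, observed_change):
    """True when a change was expected and the observation disagrees with it."""
    expected = predict_change(qprop, cause_change)
    return expected != 0 and observed_change != expected


# Distance travelled by the struck object B increases with the striker's size.
roll = QProp(dependent="distance B rolls", cause="size of striker", sign=+1)

# Going from the large striker C to the small striker A (size decreases),
# B is observed to roll farther (distance increases).
surprising = violates_expectation(roll, cause_change=-1, observed_change=+1)
```

Under this encoding, the small-object-rolls-farther event is flagged as a violation, while a larger striker producing a longer roll is not.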

Our assumption #4: piecewise mental model representation is a key argument of
compositional accounts of mental models (Collins & Gentner, 1987) and of the knowledge in
pieces (hereafter KiP) view of conceptual change (diSessa, 1993; diSessa et al., 2004). KiP
claims that conceptual reasoning involves the coordination of various phenomenological
primitives which include rules, constraints, and qualitative proportionalities such as larger
objects have greater momentum. Under KiP, conceptual change involves revising a piece of
knowledge or recombining pieces to generate new explanations.

The  plausibility  of  assumption  #4  is  not  limited  to  the  KiP  perspective.  For  example, 
researchers  who  oppose  KiP  and  advocate  a  more  coherent  account  of  human  mental  models 
(e.g.,  Vosniadou  &  Brewer,  1992;  Vosniadou,  1994;  Ioannides  &  Vosniadou,  2002)  describe  the 
existence  of  synthetic  mental  models.  In  this  coherence-based  account,  synthetic  mental  models 
are  the  result  of  partially  revising  an  intuitive  (i.e.,  pre-instructional)  mental  model  to  accord 
with  scientific  knowledge.  One  example  of  a  synthetic  model  found  by  Vosniadou  &  Brewer 
(1992)  is  a  flat,  disc-shaped  earth,  formed  by  students  who  assimilate  the  knowledge  “the  earth  is 
round” into an intuitive model of a flat, rectangular earth. If we can identify components of this
synthetic model as intuitive (e.g., the flatness of the disc-earth) and other aspects as instructional
(e.g., the roundness of the disc-earth), then we can say that even though human mental models
might be stored coherently, they are at least plausibly represented as smaller components.

Representation  assumption  #6:  people  reason  with  propositional  beliefs  is  widely  (though 
not  universally)  accepted  in  cognitive  science  (Forbus  &  Gentner,  1997;  Chi,  2008;  Vosniadou, 
1994; but see Glenberg et al., 1999; Thelen and Smith, 1994). This is supported by studies of
deductive  reasoning  (e.g.,  Rips,  2001)  and  accounts  of  conceptual  change  (e.g.,  Chi,  2008; 
Vosniadou,  1994;  diSessa,  1993).  This  does  not  mean  that  propositional  beliefs  are  always  easy 
to  change;  to  the  contrary,  Vosniadou  (1994)  argues  that  presuppositions  -  prevalent 
propositional  beliefs  such  as  “things  that  are  unsupported  from  beneath  fall  down”  -  are  the  most 
difficult  to  change. 

1.2.2  Assumptions  about  memory  and  knowledge  organization 

We now discuss psychological assumptions about how knowledge is evaluated and organized in
long-term memory.

Assumption  #7:  evaluating  competing  explanations  is  supported  by  the  literature.  Chinn  et 
al.  (1998)  propose  that  everyday  people  evaluate  explanations  based  on  the  criteria  of  empirical 
accuracy,  scope,  consistency,  simplicity,  and  plausibility,  and  scientists  evaluate  scientific 
explanations  by  the  additional  criteria  of  precision,  formalisms,  and  fruitfulness.  Lombrozo 
(2011)  mentions  additional  explanatory  virtues  by  which  people  judge  explanations,  including 
coverage  of  observations,  goal  appeal,  and  narrative  structure. 

There  is  evidence  in  the  literature  for  assumption  #8:  simultaneous  inconsistent  beliefs.  For 
example,  Collins  and  Gentner  (1987)  found  that  novices  often  use  mutually  inconsistent  mental 
models of evaporation and condensation to explain different phenomena. While novice
explanations are locally consistent for explaining individual phenomena (e.g., hot water
evaporating in a refrigerator, seeing your breath in the winter) they may be globally inconsistent.
Since  these  inconsistent  mental  models  are  narrowly  compartmentalized  by  phenomena,  the 
learner  may  never  realize  these  inconsistencies  (Gentner,  2002). 

Our model assumes that people store explanations for phenomena - including justification
structure - in long-term memory (assumption #9). This is probably a case where the model’s
assumptions  are  too  strong.  Other  theories  of  conceptual  change  suggest  that  explanations  are 
organizational  structures  (e.g.,  Carey,  1985),  but  it  seems  unlikely  that  people  retain  all  of  the 
justification  structure  of  their  explanations.  Evidence  suggests  that  if  people  do  retain 
justifications  for  their  beliefs  (and  by  extension,  the  entire  explanation(s),  according  to 
assumption  #9)  they  tend  to  retain  a  belief  even  after  the  supporting  evidence  is  discredited. 

Ross  and  Anderson  (1982)  discuss  several  experiments  that  (1)  convinced  people  of  a  belief 
(e.g.,  the  professional  performance  of  firefighters  positively  or  negatively  correlates  with  their 
score  on  a  paper  and  pencil  test)  and  then  (2)  debriefed  the  subject  to  communicate  that  the 
initial  evidence  was  fictitious  -  and  that  in  fact,  the  opposite  was  true.  In  these  studies,  the 
subject  retained  significant  confidence  in  the  belief  after  the  evidence  was  discredited, 
suggesting  that  evidence  is  not  required  for  retaining  a  belief.  In  a  similar  study  (Davies,  1997) 
people  either  read  high-quality  explanations  for  the  outcomes  of  an  event  or  constructed 
explanations  for  themselves  for  the  same  outcomes,  based  on  the  same  evidence.  After  all  of  the 
evidence  was  discredited,  subjects  who  constructed  explanations  for  themselves  were 
significantly more likely to retain the unsupported belief than those who read a high-quality
explanation.

Since people do not always rely on the evidence for their beliefs during everyday belief
revision, they might not encode all of the justifications between beliefs and supporting evidence.
Harman (1986, p. 41) argues that “[i]t stretches credulity to suppose people always keep track
of the sources of their beliefs but often fail to notice when the sources are undermined.” This is a
philosophical  appeal  to  the  simplest  explanation,  which  is  part  of  a  larger  debate  in  the 
philosophy  and  belief  revision  literature  between  the  foundations  theory  and  the  coherence 
theory.  Though  philosophical  appeals  to  this  question  do  not  provide  us  with  empirical  evidence 
for  our  assumption,  they  help  illustrate  the  dilemma. 

According  to  the  foundations  theory  (e.g.,  Doyle,  1992),  justifications  for  beliefs  are 
retained,  and  a  rational  agent  holds  a  belief  if  and  only  if  it  is  justified.  8  If  all  justifications  for  a 
belief  are  invalidated,  that  belief  is  invalidated,  and  the  justifications  it  supports  are  also 
invalidated,  resulting  in  a  possible  chain-reaction.  Conversely,  under  the  coherence  theory  (e.g., 
Gardenfors,  1990),  justifications  for  beliefs  are  not  retained  in  memory  -  if  the  agent  is  no  longer 
justified  in  believing  something  (i.e.,  there  is  no  more  evidence),  the  belief  is  still  retained 
insofar  as  it  is  consistent  with  other  beliefs.  Put  simply,  the  foundations  theory  states  that  beliefs 
are  held  only  if  there  is  rationale,  and  the  coherence  theory  states  that  once  a  belief  is  held,  it  is 
only  removed  if  there  is  rationale. 
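The contrast between the two theories can be made concrete with a small sketch. It is only an illustration under stated assumptions: the belief names (including the `hire-by-score` consequence), the conflict set, and both functions are hypothetical, and neither function is the revision mechanism of our model or of any cited system.

```python
def foundations_beliefs(premises, justifications):
    """Foundations view: a belief is held only if it is a premise or is
    derived by a justification whose antecedents are all held."""
    held = set(premises)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for antecedents, consequent in justifications:
            if consequent not in held and all(a in held for a in antecedents):
                held.add(consequent)
                changed = True
    return held


def coherence_revise(held, new_belief, conflicts):
    """Coherence view: keep every held belief unless it directly conflicts
    with the incoming belief; justifications are never consulted."""
    dropped = {b for b in held if frozenset((b, new_belief)) in conflicts}
    return (set(held) - dropped) | {new_belief}


justifications = [
    (("test-evidence",), "score-predicts-performance"),
    (("score-predicts-performance",), "hire-by-score"),
]

# With the evidence in place, both theories hold all three beliefs.
before = foundations_beliefs({"test-evidence"}, justifications)

# Foundations: discrediting the evidence removes everything it supported.
after_foundations = foundations_beliefs(set(), justifications)

# Coherence: only the directly contradicted belief is removed.
conflicts = {frozenset(("test-evidence", "evidence-was-fictitious"))}
after_coherence = coherence_revise(before, "evidence-was-fictitious", conflicts)
```

In the foundations case, the retraction cascades through the justification chain and leaves nothing; in the coherence case, the derived beliefs survive because nothing contradicts them.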

Our  cognitive  model  does  not  strictly  adhere  to  the  foundations  theory,  since  beliefs  are  not 
necessarily  retracted  when  they  lose  support  (i.e.,  they  may  become  assumptions  if  they  are  used 
to  support  other  beliefs),  but  it  does  rely  on  justification  structure  of  explanations  to  organize 
beliefs.  Other  systems  that  record  justification  structure  (e.g.,  Doyle  and  Wellman,  1990)  also 
retain  unjustified  beliefs  when  convenient. 

8 Belief revision according to the foundations theory is exemplified by Truth Maintenance Systems (Forbus & de
Kleer, 1993), discussed in Chapter 3. AI approaches that track justifications of knowledge generally encode
justifications for all premise beliefs. Consequently, observations are intrinsically justified.

Since we have no hard evidence to support assumption #9, our model might rely too heavily
on the presence of explanations in long-term memory. We describe some ideas for altering the
model to remove this assumption in section 9.4.

There  is  indirect  evidence  in  the  literature  for  assumption  #10:  retrieval  of  a  similar, 
understood  phenomenon.  During  problem  solving,  people  are  often  reminded  of  prior  problems; 
however,  these  remindings  are  often  based  on  surface-level  similarities  between  problems  rather 
than deeper relational similarities (Gentner, Rattermann, & Forbus, 1993; Ross, 1987). On the
rare  occasions  that  they  retrieve  a  useful  analog  in  a  distant  domain,  people  can  use  these  cases 
via  analogy  to  the  present  problem  to  find  a  solution  (Gick  &  Holyoak,  1980).  There  is  evidence 
that  people  have  some  success  in  retrieving  and  utilizing  similar  problems  in  the  domains  of 
mathematics  (Novick,  1988)  and  computer  programming  (Faries  &  Reiser,  1988).  It  is  therefore 
a  safe  assumption  that  people  are  reminded  of  similar  phenomena  when  faced  with  a  new 
phenomenon  to  explain,  especially  when  they  have  surface-level  similarity.  This  still  allows  for 
the  possibility  that  nothing  may  be  retrieved,  e.g.,  when  episodic  memory  is  empty  or  when  no 
previously-encountered  phenomena  are  similar.  The  simulation  described  in  Chapter  8  uses 
heuristics  to  generate  new  domain  knowledge  in  these  instances. 

1.2.3  Assumptions  about  learning 

Our claim that people can induce mental models from observations assumes that people use
analogy to generalize (assumption #11). It also makes assumptions regarding how people
represent their observations, which we address later. There is substantial evidence that both
adults and children use analogical generalization to learn categories and relationships over very
few examples. For instance, 4-year-olds can learn the abstract relational categories monotonicity
and symmetry with only a few examples, if the training is done correctly, which is elegantly
explained by analogical generalization (Kotovsky & Gentner, 1996). Further, Gentner and Namy (1999)
found that when 4-year-olds are provided a single example of a nonsense category such as “dax,”
and asked to find another dax, they choose a perceptual (i.e., surface-level) match; however,
when given two training examples and encouraged to compare, they pick a conceptual (i.e.,
relational) match. This suggests that the act of comparing as few as two examples can induce a
new category hypothesis, which is consistent with analogical generalization.
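The effect of comparing two examples can be illustrated with a deliberately crude sketch. It uses plain feature-set intersection as a stand-in for analogical generalization; it is not structure mapping and not the generalization algorithm used in our simulations. The feature sets below are hypothetical and hand-picked for illustration, and the function names are assumptions.

```python
def generalize(examples):
    """Keep only the features shared by every example."""
    feature_sets = [frozenset(e) for e in examples]
    return frozenset.intersection(*feature_sets)


def best_match(model, candidates):
    """Pick the candidate that shares the most features with the model."""
    return max(candidates, key=lambda name: len(model & candidates[name]))


# Hypothetical training examples of the nonsense category.
bicycle = {"two-circles", "thin-frame", "metal", "vehicle", "has-wheels"}
tricycle = {"three-circles", "thick-frame", "plastic", "vehicle", "has-wheels"}

# Hypothetical test items: one perceptual match, one conceptual match.
candidates = {
    "eyeglasses": {"two-circles", "thin-frame", "metal"},
    "skateboard": {"vehicle", "has-wheels", "wooden"},
}

one_example_choice = best_match(generalize([bicycle]), candidates)
two_example_choice = best_match(generalize([bicycle, tricycle]), candidates)
```

With a single example, surface features dominate and the perceptual match wins. Comparing two examples strips away the features they do not share, so the conceptual match wins.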

Our  model  assumes  that  people  can  make  ontological  revisions  to  their  concepts 
(assumption  12).  Ontological  revision  is  a  central  component  of  Chi’s  (2005;  2008)  theory  of 
conceptual  change.  Chi  calls  ontological  revision  a  categorical  shift,  whereby  a  category  such  as 
“Whale”  changes  lateral  position  in  a  hierarchical  ontology  of  categories,  e.g.,  from  a 
subordinate  position  of  “Fish”  to  a  subordinate  position  of  “Mammal.”  The  more  distant  the 
initial  and  final  position  of  a  concept,  the  more  difficult  the  conceptual  change.  Two  notable 
examples are as follows: (1) shifting “Force” from its intuitive position under “Substance”
(Reiner et al., 2000) to a lateral “Constraint-based interaction” position (Chi et al., 1994b); and
(2)  shifting  “Diffusion”  from  beneath  “Direct  process”  to  beneath  “Emergent  process”  (Chi, 
2005).  Our  model  does  not  rely  on  these  specific  ontologies,  but  it  does  assume  that  people  are 
capable  of  making  ontological  changes,  and  this  assumption  seems  safe. 
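The idea that difficulty grows with the distance of a categorical shift can be sketched as a path length in a category tree. The tree below is a hypothetical toy hierarchy assembled for illustration only; it is not Chi's ontology, and the placement of each category in it is an assumption, as are the function names.

```python
def ancestors(parent, node):
    """Return the path [node, parent(node), ..., root]."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def shift_distance(parent, old_category, new_category):
    """Number of tree edges between the old and new superordinate category."""
    up_old = ancestors(parent, old_category)
    depth_in_new = {n: i for i, n in enumerate(ancestors(parent, new_category))}
    for steps_up, node in enumerate(up_old):
        if node in depth_in_new:  # lowest common ancestor
            return steps_up + depth_in_new[node]
    raise ValueError("categories share no common ancestor")


# Hypothetical child -> parent links.
parent = {
    "Fish": "Animal",
    "Mammal": "Animal",
    "Animal": "Entity",
    "Substance": "Entity",
    "Direct process": "Process",
    "Emergent process": "Process",
    "Constraint-based interaction": "Process",
    "Entity": "Thing",
    "Process": "Thing",
}

whale_shift = shift_distance(parent, "Fish", "Mammal")
force_shift = shift_distance(parent, "Substance", "Constraint-based interaction")
```

In this toy tree, moving “Whale” from “Fish” to “Mammal” crosses two edges, while moving “Force” from “Substance” to “Constraint-based interaction” crosses four, which matches the intuition that the second shift is the harder one.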

Since our model of conceptual change involves incrementally transitioning between
theories, we rely on assumption #13: theories are not immediately replaced. For example, it
cannot be the case that learning a new and credible theory of dynamics causes a person to
immediately forget the inconsistent beliefs and models of a previous theory of dynamics. AI
algorithms for coherence-based belief revision (e.g., Alchourrón et al., 1985) immediately remove
inconsistent beliefs in this fashion. Similarly, dependency-directed backtracking algorithms for
truth maintenance (e.g., Doyle, 1979; Forbus & de Kleer, 1993) immediately retract assumptions
to retain consistency. Since we assume that people can hold contradictory beliefs (assumption
#8), these algorithms are not used in our conceptual change model.

The literature supports the assumption that competing theories can coexist psychologically.
In  their  constructivist  view  of  conceptual  change,  Smith,  diSessa,  and  Roschelle  (1994)  note  that 
as  people  accrue  theories,  they  evaluate  them  with  respect  to  their  effectiveness  in  understanding 
and  manipulating  the  world.  Under  this  view,  nonscientific  theories  can  be  used  productively 
even  when  scientifically-correct  theories  are  available.  Similarly,  students  often  learn  to  use 
quantitative  Newtonian  theories  of  force  while  still  operating  with  their  qualitative 
misconceptions of force (Clement, 1985; Hestenes et al., 1992). The Newtonian laws, e.g.,
F = ma, can also be used for qualitative reasoning. For instance, all else being equal, increasing mass
must increase force (i.e., force is qualitatively proportional9 to mass) and increasing force must
increase  acceleration  (i.e.,  acceleration  is  qualitatively  proportional  to  force).  The  predictions  of 
this  qualitative  Newtonian  theory  of  force  are  inconsistent  with  most  students’  intuitive 
qualitative  models  of  force.  Despite  their  joint  applicability,  students  might  contextualize 
Newtonian  and  intuitive  models  of  force  separately,  so  that  Newtonian  models  are  used  in 
quantitative  classroom  problem-solving  and  intuitive  models  are  used  in  everyday  qualitative 
reasoning  contexts.  This  micro-contextualization  of  mental  models  is  not  a  new  idea;  Collins 
and  Gentner  (1987)  suggest  that  this  is  the  reason  novices  are  able  to  reason  with  inconsistent 
knowledge,  often  without  detecting  an  inconsistency. 
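The qualitative reading of F = ma used above can be spelled out with partial derivatives. This is a sketch under one assumption that the text leaves implicit: force, mass, and acceleration are treated as positive magnitudes.

```latex
F = ma
\quad\Longrightarrow\quad
\left.\frac{\partial F}{\partial m}\right|_{a} = a > 0,
\qquad
\left.\frac{\partial a}{\partial F}\right|_{m} = \frac{1}{m} > 0 .
```

Both derivatives are positive, so force increases monotonically with mass when acceleration is held fixed, and acceleration increases monotonically with force when mass is held fixed. These are exactly the two qualitative proportionalities stated in the paragraph above.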

9 Qualitative proportionalities are described in greater detail in section 3.2.

As described above, our model of conceptual change involves incrementally shifting
phenomena from explanations that use a superseded theory to explanations that use a preferred
theory. Consequently, we make assumption #14: phenomena can be re-explained. This is
not  a  contentious  claim  -  studies  that  contain  a  pretest  and  posttest  to  measure  learning  (e.g., 
about  the  human  circulatory  system  in  Chi  et  al.,  1994a)  or  an  interview  during  which 
explanations  change  (e.g.,  about  the  changing  of  the  seasons  in  Sherin  et  al.,  in  press) 
demonstrate  clearly  that  people  can  change  their  explanations  for  phenomena.  This  may  not  be 
sufficient  to  show  that  people  retain  all  of  the  justifications  for  their  explanation  (assumption  #9), 
but  they  do  associate  the  phenomenon  with  new  -  or  at  least,  different  -  supporting  knowledge. 

Since  our  computational  model  relies  on  the  gradual  shift  of  explanatory  support,  it  follows 
that  the  more  explanations  include  a  theory,  the  more  computations  are  necessary  for  the  agent  to 
transition  away  from  said  theory.  In  other  words,  we  predict  that  the  more  pervasive  a 
misconception  is,  the  more  processing  is  required  to  overcome  it  (assumption  #15).  There  is  no 
direct  support  of  this  in  the  literature,  but  this  is  consistent  with  the  idea  that  productive  theories 
are  more  pervasive  and  robust  to  change  (Smith,  diSessa,  and  Roschelle,  1994). 

If  we  assume  that  some  misconceptions  require  more  processing  to  overcome  than  others 
(assumption  #15)  then  we  arrive  at  assumption  #16:  some  theories  are  more  resilient  to  change. 
As  mentioned  above,  Vosniadou  (1994;  Vosniadou  &  Brewer,  1992,  1994)  makes  a  distinction 
between  mental  models  and  presupposition  beliefs  that  constrain  these  mental  models.  For 
example,  a  mental  model  of  a  flat  earth  is  constrained  by  the  presupposition  “things  that  are 
unsupported  from  beneath  fall  down.”  In  Vosniadou’s  theory,  these  presuppositions  are  more 
resilient to change than the mental models they constrain. Further, de Leeuw (1993) and Chi
(2000) argue that the perseverance with which a belief is held increases with the number of
consequences the belief has in a network therein. 10 In our model, these networked consequences
of  a  belief  correspond  roughly  to  the  explanations  that  include  said  belief.  Our  definition  of 
theory  includes  a  set  of  beliefs,  so  this  supports  the  assumption  that  theories  vary  in  their 
resilience  to  change. 

Researchers  have  also  characterized  how  people  resist  changing  their  beliefs.  People  use 
evasive  strategies  called  knowledge  shields  (Feltovich  et  al.,  2001)  to  ignore  anomalous  data,  and 
they  use  other  strategies  such  as  rejecting,  reinterpreting,  excluding,  and  holding  knowledge  in 
abeyance  (Chinn  &  Brewer,  1993;  1998)  to  resist  change.  In  the  event  that  people  do  revise  their 
beliefs,  they  frequently  make  minimal  changes  to  their  present  theory  rather  than  adopting  a  new 
theory  in  its  entirety  (Posner  et  al.,  1982;  Chinn  &  Brewer,  1998).  All  of  the  simulations 
described  below  are  biased  toward  minimizing  changes.  For  example,  Chapter  7  describes 
simulation trials that learn humanlike misconceptions by choosing to use concepts (e.g., “heart”)
known prior to instruction over other concepts (e.g., “left-heart”) that were acquired by formal
instruction. Since the focus of this dissertation is conceptual change, we are more interested in
simulating the successful - albeit minimal - revision of beliefs rather than the avoidance of belief
change; however, modeling avoidance strategies is an interesting opportunity for future work.

To  support  the  claims  of  this  dissertation,  we  have  developed  a  model  of  conceptual  change, 
implemented  the  model  on  a  cognitive  architecture,  and  conducted  four  simulation  experiments 
to  compare  the  trajectory  of  models  that  the  system  undergoes  to  the  trajectory  of  mental  models 
of  human  learners.  Our  computational  model  is  described  in  Chapter  4,  and  is  a  novel 
contribution  of  this  dissertation.  The  only  aspects  of  our  model  that  are  not  novel  contributions 
are described in Chapter 3, our discussion of background AI technologies.

10  It  is  unclear  whether  “consequences”  refer  to  logical  entailments  (in  the  philosophical  coherence-based  view  of 
belief  revision)  or  justifications  (in  the  philosophical  foundations  view  of  belief  revision)  supported  by  a  belief. 
Regardless,  this  supports  the  assumption  that  some  beliefs  are  more  resilient  to  change  than  others. 


In the next chapter we describe other theories of conceptual change from the cognitive
science literature and discuss areas of contention between them. A comparison of our model
with these previous models is best done after our simulation results are presented, and is hence
postponed until Chapter 9.


Chapter  2:  Other  theories  of  conceptual  change 


One  aim  of  the  cognitive  model  presented  in  this  dissertation  is  to  provide  insight  into  the 
cognitive  processes  underlying  human  conceptual  change.  This  warrants  a  discussion  of  existing 
theories  of  conceptual  change  and  the  areas  of  dispute  that  our  model  might  help  explicate. 

None of the conceptual change theories we discuss have computational models that capture the
full spectrum of belief changes they describe. 11 Consequently, some speculation is necessary for
determining each theory’s constraints on knowledge representation, memory organization, and
revision mechanisms.

Despite the consensus that concepts are the granularity of change in conceptual change,
different theories of conceptual change make different assumptions regarding what a concept is
and how it changes (diSessa and Sherin, 1998). No theory uses the word “concept” exactly as
any other theory does or exactly as we do in our cognitive model - in fact, we try to avoid this
vague term. Unfortunately, we must use “concept” when discussing other theories to avoid
making over-specific assumptions about knowledge representation, since the theorists’
definitions of “concept” may be intentionally abstract or noncommittal.

Ideally, we could compare our model of conceptual change with other computational models
that implement these four theories: they could learn from the same training data and we could
monitor their progress over time using the same testing data. Unfortunately, since none of these
theories have computational models that capture the full spectrum of belief changes they
describe, this is not feasible. The other possibility is to modify our model to reflect the different
aspects of these theories. This is not feasible, since the underlying algorithms and knowledge
representations have not been specified for these theories. Ultimately, we must compare our
model to these four theories by abstracting the assumptions and behaviors of our model into a
psychological theory of conceptual change, and then comparing the theories at that level. We
save this discussion for Chapter 9, after we have presented the data from our simulations.

11 INTHELEX (Esposito et al., 2000a; Vosniadou et al., 1998) has been used to model aspects of conceptual change
in learning the meaning of “force” using logical theory refinement; however, the system is given multiple
representations of the “force” concept (e.g., “internal” force and “acquired” force) from the start, and does not invent
and transition between representations spontaneously as children do, according to Ioannides and Vosniadou (2002).

This  chapter  begins  by  describing  four  theories  of  human  conceptual  change  that  aim  to 
explain  how  people  adopt  new  beliefs  in  the  presence  of  conflicting  beliefs.  For  each  theory,  we 
discuss  its  underlying  assumptions  about  knowledge  representation,  memory  organization,  and 
mechanisms  of  change.  After  discussing  these  theories  of  conceptual  change,  we  discuss  some 
notable  areas  of  divergence  and  disagreement. 

2.1  Four  theories  of  conceptual  change 

The  conceptual  change  theories  we  discuss  include  the  theory-theory  of  conceptual  development, 
framework  theory,  categorical  shift,  and  knowledge  in  pieces.  Each  theory  makes  different 
commitments  to  the  representation  of  categories  and  mental  models,  the  organization  of  this 
knowledge  in  the  mind,  and  the  mechanisms  that  carry  out  change. 

2.1.1  Carey’s  theory-theory  of  conceptual  development 

We  begin  by  discussing  Susan  Carey’s  (1985;  1988;  2009)  theory  of  conceptual  change.  Carey’s 
theory  is  characterized  by  a  strong  appeal  to  the  history  of  science  to  draw  similarities  between 
conceptual  change  in  children  and  in  the  scientific  community.  It  also  relies  on  Kuhn’s  (1962) 
notion  of  incommensurability  between  conceptual  systems.  Incommensurability  is  a  relation  that 
holds between the languages of two theories. Two conceptual systems (i.e., theories with
propositional beliefs, categories, and models) CS1 and CS2 are incommensurable if CS1 contains
concepts that are incoherent from the point of view of CS2. That is, the beliefs, laws, and
explanations that can be stated in CS1’s language cannot be expressed in the language of CS2.
The presence of concepts in CS1 that are merely absent in CS2 is not sufficient for
incommensurability. 

For an example of incommensurability, consider Jean Buridan’s theory of projectile
dynamics (based heavily on Aristotelian dynamics) with respect to Newtonian projectile
dynamics. Buridan’s and Newton’s dynamics use different vocabularies - Buridan uses the
concept of impetus, and Newton uses the concept of force. Buridan’s concept of impetus is
proportional to velocity, so an impetus in the direction of motion sustains an object’s velocity.
Newtonian net force is proportional to acceleration, so a non-zero net force in the direction of
motion increases an object’s velocity. Also, an object moving at constant velocity has a constant
impetus (i.e., the impetus is not weakened by gravity or air resistance) in Buridan’s theory, but it
has a zero net force in Newtonian theory. A final point of contrast is the motion of bodies on
circular paths. Buridan’s theory states that circular impetuses sustain the circular motion of
celestial bodies. In some ways, this is a simpler explanation than accounting for the tangential
velocity of orbiting bodies with inward acceleration due to the curvature of space-time. Carey’s
examples  of  incommensurability  include  other  historical  examples  (e.g.,  the  source-recipient 
theory  of  heat  versus  the  caloric  theory  of  heat)  and  developmental  examples  (e.g.,  theories  of 
physics  with  and  without  weight  differentiated  from  density). 
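
The contrast between the two vocabularies can be stated schematically. The following is only an illustrative sketch: the symbols I (impetus), k (a proportionality constant), and m (mass) are notational assumptions introduced here and do not appear in Carey’s account.

```latex
\begin{align*}
\text{Impetus theory:}   \quad & I = k\,v,
  && \text{so a constant nonzero } v \text{ requires a constant nonzero } I, \\
\text{Newtonian theory:} \quad & F_{\mathrm{net}} = m\,a,
  && \text{so a constant } v \text{ gives } a = 0 \text{ and } F_{\mathrm{net}} = 0.
\end{align*}
```

Under these assumptions, the same observation (an object moving at constant velocity) is described by a nonzero quantity in one vocabulary and by a zero quantity in the other.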

Under Carey’s theory, conceptual change involves a shift from a conceptual system CS1 to
an incommensurable conceptual system CS2. Both conceptual systems are internally coherent,
stable, and symbolically represented. The difficulty of achieving conceptual change in some
domains, e.g., learning to differentiate weight from density, stems from this incommensurability.
When  novices  and  experts  hear  “weight,”  they  understand  something  different,  and  the 
corresponding  novice  and  expert  ideas  are  mutually  incoherent.  This  is  an  obstacle  for  effective 
communication  and  formal  instruction.  Since  children  must  acquire  the  scientific  account  CS2 
via  social  processes,  incommensurability  makes  conceptual  change  difficult. 

The  process  of  conceptual  change  must  therefore  create  representations  for  CS2  that  are 
qualitatively different from those in CS1. Carey (2009) argues that children perform Quinian
bootstrapping to achieve this. Quine (1960) describes bootstrapping using a metaphor: you use a
ladder to build a platform in a conceptual system until the platform is self-sustaining, and then
you kick the ladder out from under. In the case of historical and psychological conceptual
change, the symbols that represent concepts (e.g., weight and density) are used as placeholders
for developing a new conceptual system CS2. Processes such as analogy (e.g., Gentner et al.,
1997),  model-based  thought  experimentation  (e.g.,  Nersessian,  2007),  and  abduction  are  used  to 
integrate  new  knowledge  and  support  observations  using  these  placeholder  symbols.  In  this 
manner,  placeholder  concepts  are  learned  together  and  gain  meaning  relative  to  each  other.  This 
bootstrapping  process  is  iterative,  and  through  successive  rounds  of  analogy,  abduction,  and 
model-based  reasoning,  the  concepts  in  CS2  acquire  meaning  and  are  used  to  explain  real-world 
phenomena. 

2.1.2 Vosniadou’s framework theory

Like Carey’s theory-theory of conceptual development, Vosniadou’s (2002; 1994; Vosniadou
and Brewer, 1992; 1994; Ioannides and Vosniadou, 2002) theory posits that novices have an
internally coherent intuitive understanding of the world that is subject to modification and radical
revision. In place of Carey’s conceptual systems, Vosniadou uses the term framework theories.
Children’s  framework  theories  are  coherent  explanatory  systems,  but  they  lack  characteristics  of 
scientific  theories  such  as  systematicity,  abstractness,  social  nature,  and  metaconceptual  access 
(Vosniadou, 2007; Ioannides and Vosniadou, 2002). Embedded within a framework theory are
specific theories about phenomena (e.g., the day/night cycle) and entities (e.g., the earth).
Specific theories are also referred to as specific explanations (Ioannides and Vosniadou, 2002).
Finally,  embedded  within  the  framework  theory  and  specific  theories  are  mental  models.  The 
embedded  nature  of  knowledge  refers  to  the  direction  of  constraint:  the  framework  theory 
constrains  the  specific  theories/explanations,  which  in  turn  constrain  the  mental  models 
(Vosniadou,  2002). 

Framework  theories  contain  presuppositions,  which  are  propositional  beliefs  that  are  learned 
from  observations  and  cultural  influences.  Each  presupposition  places  consistency  constraints  on 
the  specific  theories  embedded  within  the  framework  theory.  In  this  fashion,  presuppositions 
limit  the  space  of  allowable  specific  theories,  and  indirectly,  the  space  of  allowable  mental 
models.  For  example,  the  presupposition  “unsupported  objects  fall  down”  affects  the  specific 
theory  and  mental  model  of  the  earth,  since  a  spherical  earth  with  people  standing  on  the 
“bottom”  would  contradict  the  presupposition.  It  is  assumed  that  changing  a  specific  theory 
(e.g.,  of  the  shape  of  the  earth)  is  easier  than  retracting  presuppositions,  provided  the  new 
specific  theory  is  consistent  with  existing  presuppositions. 

In  Vosniadou’s  theory,  the  main  difficulty  of  conceptual  change  is  that  students  frequently 
assimilate  aspects  of  a  scientific  explanation  into  their  flawed  framework  theory  without 
sufficiently  revising  their  presuppositions.  In  these  cases,  learners  either  (1)  do  not  notice  the 
contradictions between the new information and their presuppositions and explanations, or (2)
they notice contradictions and only make partial (i.e., insufficient) changes to their
presuppositions  and  explanations.  Partial  revision  of  a  framework  theory  can  produce  new 
misconceptions  and  synthetic  models  (Vosniadou  and  Brewer,  1992;  1994;  Ioannides  and 
Vosniadou,  2002),  which  are  incorrect  mental  models  that  incorporate  both  intuitive  and 
scientific  components.  Consider  integrating  the  belief  “the  earth  is  round”  into  a  framework 
theory  that  contains  the  “unsupported  objects  fall  down”  presupposition  with  a  mental  model  of 
the earth as a flat rectangle. Since presuppositions are more resilient, the mental model
of  the  earth  is  the  easiest  component  to  revise,  and  the  earth  may  be  conceived  of  as  a  flat 
cylinder,  a  flattened  sphere,  or  even  a  hollow  sphere  with  a  flat  surface  inside  (Vosniadou  and 
Brewer,  1992).  The  mental  model  of  the  earth  is  thereby  constrained  by  the  presupposition,  and 
the  learner  must  revise  this  presupposition  to  acquire  the  correct  mental  model  of  the  earth. 
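
The direction of constraint described above can be sketched in code. This is a minimal illustration under our own assumptions, not an implementation from the framework theory literature: the `EarthModel` encoding, the single boolean attribute, and the function names are all hypothetical. Each presupposition is treated as a predicate that a candidate mental model must satisfy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EarthModel:
    """A candidate mental model of the earth (hypothetical, simplified encoding)."""
    name: str
    people_stand_on_underside: bool


# A presupposition is modeled here as a consistency check on a mental model.
Presupposition = Callable[[EarthModel], bool]


def unsupported_objects_fall_down(model: EarthModel) -> bool:
    # Violated by any model in which people stand on the "bottom" of the earth.
    return not model.people_stand_on_underside


def allowable_models(candidates: List[EarthModel],
                     presuppositions: List[Presupposition]) -> List[EarthModel]:
    """Keep only the models consistent with every presupposition."""
    return [m for m in candidates if all(p(m) for p in presuppositions)]


CANDIDATES = [
    EarthModel("flat rectangle", False),
    EarthModel("hollow sphere with a flat surface inside", False),
    EarthModel("sphere with people all around", True),
]

# While the presupposition is held, the scientific model is excluded.
with_presupposition = allowable_models(CANDIDATES, [unsupported_objects_fall_down])
# Once the presupposition is retracted, all candidates become allowable.
after_retraction = allowable_models(CANDIDATES, [])
```

In this sketch the synthetic hollow-sphere model survives the filter while the spherical model with people all around does not, which mirrors the claim that the presupposition must be revised before the correct model can be acquired.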

Changing  a  framework  theory  is  a  gradual  process,  driven  by  observation,  explanation,  and 
formal  education.  Throughout  the  process  of  learning  science,  aspects  of  scientific  theories  are 
assimilated  into  the  theories/explanations  embedded  within  the  student’s  framework  theory,  as 
well  as  into  the  framework  theory  itself.  This  yields  a  series  of  synthetic  models  which  approach 
the  correct  scientific  theory. 

2.1.3 Chi’s categorical shift

Chi and colleagues (Chi, 2008; 2005; 2000; Reiner et al., 2000; Chi et al., 1994b) distinguish
between  three  different  types  of  conceptual  change:  (1)  categorical  shift;  (2)  mental  model 
transformation;  and  (3)  belief  revision.  All  three  types  of  conceptual  change  require  that  some 
existing  knowledge  is  retracted  or  revised;  otherwise,  this  would  constitute  gap-filling, 
enrichment, or tabula rasa knowledge acquisition. We discuss each of these types of change
according to Chi’s theory, including the type of knowledge affected and the mechanism of
change. 

Categorical  shift  was  briefly  discussed  in  the  previous  chapter.  It  involves  changing  a 
category’s  lateral  position  in  a  hierarchy  of  categories.  Chi’s  theory  assumes  the  existence  of 
multiple,  disconnected  ontological  trees  with  multiple  levels  of  inheritance.  For  instance,  Chi 
(2008) identifies three ontological trees: (1) “Entities,” which has subordinate branches “Concrete
Objects” and “Substances”; (2) “Processes,” which has branches “Direct” and “Emergent”; and
(3) “Mental States,” with branches “Emotion” and “Intention.” Each tree and level in the
hierarchy  ascribes  ontological  attributes  to  subordinate  categories,  e.g.,  a  lamp  (under  the 
“Artifacts”  branch  of  the  “Entities”  tree)  can  be  broken  and  a  hug  (under  the  “Events”  branch  of 
the  “Processes”  tree)  can  be  a  minute  long.  All  else  being  equal,  the  greater  the  lateral  distance 
between  two  categories,  the  more  their  ontological  attributes  differ.  This  distance  is  an  important 
consideration  for  Chi’s  theory,  because  shifting  a  category  from  one  place  in  the  hierarchy  to 
another  involves  changing  ontological  attributes  -  and  the  greater  the  distance,  the  greater  the 
change.  For  example,  “Fish”  and  “Mammals”  categories  both  share  the  close  ancestor  category 
of  “Animals”  under  the  “Entities”  tree.  These  categories  are  much  closer  than  “Substances” 
(under  the  “Entities”  tree)  is  to  “Constraint-Based  Interactions”  (under  the  “Processes”  tree). 
Shifting “Whale” from “Fish” to “Mammals” is easier (i.e., fewer ontological attributes must
change) than shifting a category such as “Force” from “Substances” (Reiner et al., 2000) to
“Constraint-Based  Interactions.”  Categorical  shifts  are  incommensurate,  according  to  Carey’s 
(1985)  definition  of  incommensurability  (Chi,  2008). 
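
The notion of lateral distance can be made concrete with a small sketch. It rests on several assumptions of ours: the parent links below only approximate the fragments of the hierarchy mentioned above (for instance, “Animals” is placed directly under “Entities”), and the `cross_tree_penalty` used for categories in disconnected trees is an arbitrary illustrative value, not part of Chi’s theory.

```python
from typing import Dict, List, Optional

# Hypothetical fragment of the ontological trees; None marks the root of a tree.
PARENT: Dict[str, Optional[str]] = {
    "Entities": None,
    "Processes": None,
    "Substances": "Entities",
    "Animals": "Entities",  # assumption: intermediate levels are omitted
    "Fish": "Animals",
    "Mammals": "Animals",
    "Constraint-Based Interactions": "Processes",
}


def path_to_root(category: str) -> List[str]:
    """Return the chain of categories from `category` up to its tree root."""
    path = [category]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def lateral_distance(a: str, b: str, cross_tree_penalty: int = 10) -> int:
    """Count the links between two categories via their closest shared ancestor.

    Categories in disconnected trees share no ancestor, so the sketch adds an
    assumed penalty to the combined path lengths.
    """
    path_a, path_b = path_to_root(a), path_to_root(b)
    depth_in_a = {category: i for i, category in enumerate(path_a)}
    for j, category in enumerate(path_b):
        if category in depth_in_a:
            return depth_in_a[category] + j
    return len(path_a) + len(path_b) + cross_tree_penalty


within_tree = lateral_distance("Fish", "Mammals")
across_trees = lateral_distance("Substances", "Constraint-Based Interactions")
```

With these assumed links, the “Fish” to “Mammals” shift yields a smaller distance than the “Substances” to “Constraint-Based Interactions” shift, in line with the relative difficulty described above.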

In  Chi’s  theory,  belief  revision  occurs  at  the  granularity  of  propositional  beliefs,  when  new 
information is logically inconsistent with prior beliefs. For example, the belief “the heart
oxygenates blood” is inconsistent with the new information “only the lungs oxygenate blood.”
When this occurs, students can retract the existing belief, adopt the new information, and
continue looking for inconsistencies. In reality, students generally encounter information that
conflicts  less  directly  with  their  existing  beliefs,  such  as  “the  lungs  oxygenate  blood”  (i.e.,  still 
logically  permitting  the  heart  to  oxygenate  blood  also),  but  they  still  achieve  successful  belief 
revision  even  through  indirect,  implicit  conflict  (Chi,  2008). 
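
The difference between direct and indirect conflict can be illustrated with a deliberately simple sketch. The encoding of beliefs as (agent, function) pairs and the `exclusive` flag standing in for the word “only” are our own assumptions; the sketch covers purely logical retraction and does not model how students succeed under implicit conflict.

```python
from typing import Set, Tuple

Belief = Tuple[str, str]  # (agent, function), a hypothetical encoding


def revise(beliefs: Set[Belief], agent: str, function: str,
           exclusive: bool) -> Set[Belief]:
    """Adopt the belief (agent, function), retracting logically inconsistent beliefs.

    If `exclusive` is True (as in "only the lungs oxygenate blood"), any belief
    that assigns the same function to a different agent is retracted.
    """
    revised = set(beliefs)
    if exclusive:
        revised = {b for b in revised if b[1] != function or b[0] == agent}
    revised.add((agent, function))
    return revised


prior = {("heart", "oxygenates blood"), ("heart", "pumps blood")}

# Direct conflict: "only the lungs oxygenate blood."
direct = revise(prior, "lungs", "oxygenates blood", exclusive=True)
# Indirect conflict: "the lungs oxygenate blood."
indirect = revise(prior, "lungs", "oxygenates blood", exclusive=False)
```

In the indirect case the flawed belief about the heart survives purely logical revision, which is why the success of students under implicit conflict requires an account beyond logical inconsistency.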

The  third  type  of  conceptual  change  in  Chi’s  theory  is  mental  model  transformation,  which 
is  a  special  case  of  belief  revision.  In  Chi’s  framework,  mental  models  are  organized  groups  of 
propositional  beliefs  which  can  predict  changes  and  outcomes  in  a  situation  or  system  such  as  the 
human  circulatory  system.  When  a  mental  model  is  flawed,  it  is  internally  coherent  but 
generates  incorrect  explanations  and  predictions.  Two  mental  models  (e.g.,  a  flawed  and  a 
correct  model)  are  in  conflict  when  they  make  mutually  inconsistent  predictions  and 
explanations,  even  though  the  beliefs  that  comprise  the  mental  models  might  not  be  explicitly 
contradictory. Mental models are ultimately transformed by the revision of the beliefs that
comprise  the  mental  model.  For  this  to  occur,  new  information  must  be  in  explicit  or  implicit 
conflict  with  the  beliefs  of  the  mental  model,  according  to  the  above  description  of  belief 
revision.  Some  false  beliefs  are  more  “critical”  than  others  (Chi,  2008)  in  that  they  discriminate 
between  a  flawed  and  correct  model.  For  example,  the  false  belief  “the  heart  oxygenates  the 
blood”  is  more  critical  to  explaining  and  predicting  the  behavior  of  the  circulatory  system  than 
the  false  belief  “all  blood  vessels  have  valves.” 

These accounts of belief revision and mental model transformation do not involve
incommensurability, as defined by Carey (2009). This is because a mental model shares the
same symbolic vocabulary before and after its transformation, even though entities may be added
or removed. This assumes that no categorical shift occurs during mental model transformation.
Only  categorical  shifts  involve  incommensurability,  since  the  vocabulary  changes  (i.e., 
categories  gain  and  lose  ontological  attributes). 

2.1.4  diSessa’s  knowledge  in  pieces 

The  Knowledge  in  Pieces  (KiP;  diSessa,  1988;  1993)  view  argues  that  intuitive  knowledge 
consists of a multitude of inarticulate explanatory phenomenological primitives (p-prims) which
are  activated  in  specific  contexts.  P-prims  are  phenomenological  in  that  (1)  they  provide  a  sense 
of  understanding  when  they  are  evoked  to  explain  or  interpret  a  phenomenon  and  (2)  they 
provide  a  sense  of  surprise  when  they  cannot  be  evoked  to  explain  a  situation  or  when  their 
predictions  are  inconsistent  with  reality.  They  are  primitive  in  that  they  are  generally  invoked  as 
a  whole  and  they  need  no  justification. 

P-prims  are  not  systematic  enough  to  be  described  individually  or  collectively  as  a  coherent 
theory (diSessa et al., 2004). Furthermore, a student may operate with an incoherent set of
p-prims - that is, his or her p-prims may make conflicting predictions about a situation, similar to
Chi’s  (2008)  account  of  conflicting  mental  models.  This  is  in  direct  disagreement  with  the 
coherent  nature  of  Carey’s  conceptual  systems  and  Vosniadou’s  framework  theories. 

A  person  or  an  AI  system  with  incoherent  conceptual  knowledge  may  seem  unlikely  or 
unproductive  to  some,  but  according  to  KiP,  each  piece  of  knowledge  is  highly  contextualized 
with  respect  to  its  applicability  in  the  real  world  (diSessa  et  al.,  2004).  This  allows  people  to 
provide coherent explanations for individual phenomena despite global inconsistency.12 If a
novice generates a coherent explanation, it is an effect of knowledge contextualization and of the
process of explanation construction; it is not a hard constraint on how knowledge is organized in
memory.

12 Collins and Gentner (1987) provide empirical evidence that novices can narrowly contextualize inconsistent
mental models to achieve internally consistent explanations, but their account of mental models (see Gentner and
Stevens, 1983) is not committed to fragmentation or p-prims, according to the knowledge in pieces perspective.

Since  KiP  does  not  involve  coherent  theories  or  conceptual  systems,  what  constitutes 
misconceptions  and  conceptual  change?  Smith,  diSessa,  and  Roschelle  (1993)  argue  that  the 
standard  model  of  misconceptions  -  that  students  hold  flawed  ideas  which  are  replaced  during 
instruction  -  conflicts  with  the  premise  of  constructivism  that  students  build  more  advanced 
knowledge  from  existing  understandings.  KiP  emphasizes  the  continuity  from  novice  to  expert 
knowledge and the presence of intuitive knowledge within expert understanding (Sherin, 2006).
Consequently,  KiP  focuses  on  knowledge  refinement  and  reorganization  rather  than  replacement. 
Minstrell’s  (1982,  1989)  KiP  account  of  conceptual  change  involves  the  recombination  of 
explanatory  primitives  and  reuse  in  different  contexts.  Similarly,  diSessa  (1993)  describes  how 
the  contexts  and  priorities  of  p-prims  can  be  altered  to  change  how  learners  construct 
explanations  and  predictions  in  future  situations. 
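
The idea that change consists of altering contexts and priorities, rather than replacing knowledge, can be sketched as follows. This is a rough illustration under our own assumptions: the `PrimitiveStore` class, the numeric priorities, and the two context labels are hypothetical, and the “blocking” p-prim (diSessa, 1993) is used only as an example piece of knowledge.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class PrimitiveStore:
    """Pieces of knowledge with a separate priority in each context (assumed design)."""

    def __init__(self) -> None:
        # (piece, context) -> priority; unseen pairs default to 0.0
        self.priority: Dict[Tuple[str, str], float] = defaultdict(float)

    def adjust(self, piece: str, context: str, delta: float) -> None:
        """Raise or lower the priority of a piece within one context only."""
        self.priority[(piece, context)] += delta

    def evoke(self, candidates: List[str], context: str) -> str:
        """Return the candidate with the highest priority in the given context."""
        return max(candidates, key=lambda piece: self.priority[(piece, context)])


store = PrimitiveStore()
pieces = ["blocking", "normal force"]

# Intuitive knowledge has been used in both contexts.
store.adjust("blocking", "book on table", 5.0)
store.adjust("blocking", "book on spring", 5.0)

# Instruction raises the priority of normal forces in one context only.
store.adjust("normal force", "book on spring", 6.0)

on_table = store.evoke(pieces, "book on table")
on_spring = store.evoke(pieces, "book on spring")
```

Nothing is deleted in this sketch: the intuitive piece remains available and is still evoked in the context where its priority was not overtaken.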

Under  KiP,  the  difficulty  of  conceptual  change  is  a  factor  of  how  productive  a  piece  of 
knowledge  is  within  a  given  context.  Suppose  a  learner  has  previously  predicted  and  understood 
the  world  using  the  kinematic  “blocking”  p-prim  (diSessa,  1993)  whereby  an  object  such  as  a 
brick  blocks  a  moving  object  without  any  sense  of  effort  or  strain  (e.g.,  the  brick  does  not  visibly 
move, bend, or compress). The more productive “blocking” has been at explaining and
predicting within a class of phenomena (e.g., putting objects atop rigid surfaces, thus
preventing  the  object  from  moving  further  downward),  the  more  difficult  it  will  be  to  assign 
other knowledge besides “blocking” (e.g., of normal forces) to be evoked in this context.

2.2 Divergence and disagreement

All  of  the  above  theories  aim  to  explain  documented  examples  of  conceptual  change,  so  there  is 
considerable  consensus  about  the  principles  and  constraints  of  conceptual  change.  There  are  also 
many  points  of  contention  among  the  four  theories  outlined  above.  We  discuss  four  topics  that 
lack  consensus  which  are  especially  relevant  to  our  cognitive  model:  (1)  what  counts  as 
conceptual change; (2) revision versus addition; (3) the effect of explaining; and (4) the source of
coherence.  We  discuss  these  topics  with  regard  to  our  model  in  Chapter  9,  after  we  have 
described  the  simulations  that  exemplify  our  model’s  behavior. 

2.2.1  What  counts  as  conceptual  change 

Carey  (2009)  argues  that  incommensurability  is  a  necessary  condition  for  conceptual  change. 
This  necessarily  involves  creating  new  primitives,  symbols,  and  relationships  that  were  not 
coherently  describable  in  the  language  of  the  existing  conceptual  system.  Requiring 
incommensurability  sets  Carey’s  theory  apart  from  the  other  theories. 

Chi’s  (2008)  account  of  conceptual  change  includes  categorical  shift  (i.e.,  change  of  the 
incommensurable  sort)  and  also  commensurable  changes  such  as  mental  model  transformation 
and  belief  revision.  Similarly,  Vosniadou  (1994;  Vosniadou  and  Brewer,  1992;  1994;  Ioannides 
and  Vosniadou,  2002)  considers  the  revision  of  mental  models  a  type  of  conceptual  change. 
Changing  the  presuppositions  of  a  framework  theory  -  a  type  of  belief  revision  -  is  a  key 
operation in Vosniadou’s theory of conceptual change.

Also  in  disagreement  with  Carey,  diSessa  (2006)  argues  against  the  necessity  of 
incommensurability within conceptual change. Collecting and coordinating elements of
knowledge is the mechanism of conceptual change for KiP, so incommensurability is not a
worthwhile  distinction. 

This  particular  point  of  contention  concerns  terminology  rather  than  human  cognitive 
processes. Carey (2009) states clearly that “‘conceptual change’ means change in individual
concepts” (p. 354), but the other theories - most notably, Chi’s - include other manners of
non-monotonic belief revision (i.e., removing beliefs to accommodate new information). We include
mental model transformation in our definition of conceptual change, as described in Chapter 1.
We  also  include  category  revision,  which  abides  by  Carey’s  definition  of  conceptual  change. 

2.2.2  Revision  versus  addition 

There  is  a  deep  but  subtle  distinction  between  these  theories  of  conceptual  change  that  has  not,  I 
believe,  been  given  sufficient  attention.  It  concerns  the  revision  of  information  in  memory. 
Consider  the  following  example  of  conceptual  change:  a  student  is  learning  Newtonian 
dynamics.  She  generally  operates  with  a  flawed  account  of  force,  in  that  it  is  substance-like 
(Reiner et al., 2000), impetus-like (Ioannides and Vosniadou, 2002), or it includes the
“force-as-mover” p-prim (diSessa, 1993). Consequently, she generally believes that motion implies the
existence of a force. We call this initial account of force Force[1]. Consider also that the student
is guided through a Newtonian explanation of a puck sliding on ice at constant velocity, ignoring
friction (i.e., the ice’s upward force against the puck counters the downward force of gravity on
the puck, resulting in a zero net force). After consideration, she now confidently understands
this phenomenon P1 with a Newtonian concept of force Force[2].

This  raises  several  questions  which  have  far-reaching  implications.  How  is  this  new  concept 
of force Force[2] stored relative to the old concept Force[1]? Does Force[1] shift/change
directly, thus becoming Force[2]? Is an entirely new Force[2] concept added, e.g., as a
placeholder or by copying and revising the old concept? We call this the problem of information
revision. Figure 4 illustrates an extremely simplified topology of how information might be
revised in the student’s memory. The white nodes are phenomena in the student’s memory, and
the black nodes are categories (of force) that have been used to explain these phenomena. Figure
4(a) plots the comprehension of three phenomena P1-P3 before learning Force[2], and Figure
4(b-f) shows five possible accounts of the student’s state after learning Force[2] and its relevance
to P1. Figure 4 does not represent knowledge at the proper granularity for each of the four
theories (e.g., for KiP, force is represented, in part, by a causal network), and it does not include
all imaginable schemes of information revision. However, it is suitable for discussing
differences in conceptual change theories. We discuss each of the information revision schemes
shown in Figure 4, some assumptions behind them, and some implications for theories of
conceptual change. We refer to the previously existing category (e.g., Force[1]) as the prior
category, and the new/revised category (e.g., Force[2]) as the subsequent category.

If  categories  are  directly  revised  as  in  Figure  4(b-c),  then  the  prior  category  literally 
becomes  the  subsequent  category,  and  afterward  there  is  no  trace  of  the  prior.  In  the  case  of 
Figure  4(b),  the  learner  immediately  loses  understanding  of  phenomena  (e.g.,  P2  and  P3)  that 
were understandable in terms of the old concept but not in terms of the new concept. This seems
unlikely, since students have access to their misconceptions after becoming acquainted with
scientific concepts (Clement, 1982).

Figure 4: Five possible accounts of how category information is revised. Black and white nodes represent
categories and phenomena, respectively. Arrows indicate “is understood in terms of.” Dotted zones indicate
contexts. (a) Initial state with category Force[1]. (b-f) Possible resultant states after incorporating Force[2].

A  second  variety  of  direct  revision  is  depicted  in  Figure  4(c):  the  prior  category  is  directly 
revised  into  the  subsequent  category,  and  the  learner  immediately  comprehends  previous 
phenomena (e.g., P2 and P3) in terms of the subsequent category. This shares the same problem
we mentioned for direct revision, and also creates more problems. First, it is unlikely that the
Force[1] and Force[2] categories overlap perfectly in the range of phenomena they can explain,
so a perfect substitution is not plausible. Additionally, there is empirical evidence that novices
can utilize different conceptual knowledge based on the phenomena that need to be explained.
For instance, 70% of the novice subjects in diSessa et al. (2004) claimed that different forces
were at work in phenomena similar to P2 and P3 described in Figure 4(a). Similarly, Collins and
Gentner (1987) interviewed a subject who explained two slightly different instances of
evaporation  with  different,  mutually  incoherent,  evaporation  mechanisms. 

Some  of  these  problems  with  direct  revision  can  be  solved  by  assuming  that  prior  and 
subsequent  categories  actually  coexist  for  some  time.  In  this  case,  conceptual  change  involves 
copying  and  revising  (hereafter  copy-revising)  the  prior  knowledge  to  create  a  minimally  or 
radically  different  subsequent  knowledge.  Copy-revision  is  shown  in  Figure  4(d),  where  the 
previous category is still used to understand P2 and P3, but understanding of P1 has been shifted
to  the  subsequent  concept.  The  prior  knowledge  (e.g.,  a  substance-like  category  of  force)  and  the 
subsequent  knowledge  (e.g.,  Newtonian  force)  have  different  existence  conditions  and 
consequences,  so  they  are  mutually  incoherent.  If  we  assume  copy-revision  happens,  then  we 
have  many  other  questions  to  answer:  How  do  people  form  coherent  explanations  with 
incoherent  knowledge?  How  does  a  student  eventually  use  the  subsequent  knowledge  in  place  of 
the  prior  knowledge,  where  applicable?  What  mechanisms  monitor  the  performance  of  the  prior 
and  subsequent  concepts  and  shift  their  contexts? 

A  fourth  possibility  is  shown  in  Figure  4(e):  categories  are  copy-revised  and  the  prior 
category  is  quarantined.  In  quarantine,  the  prior  category  cannot  be  used  to  explain  new 
phenomena  -  it  only  exists  until  the  phenomena  it  supports  (e.g.,  P2  and  P3)  are  understood  in 
terms  of  other  concepts.  This  makes  the  unlikely  assumption  that  once  a  student  has  a  small 
foothold  in  Newtonian  dynamics  she  immediately  discredits  her  prior  intuitive  concepts  across 
all  possible  contexts. 

A final information revision scheme is shown in Figure 4(f): categories are copy-revised and
the prior Force[1] and subsequent Force[2] categories are explicitly contextualized in
Context[1] and Context[2], respectively. These contexts then behave as walls to maintain
internal coherence of the knowledge within. This solves the potential problem of incoherence
and it allows the prior category Force[1] to continue to be utilized selectively. However, this
also raises additional questions: Are new contexts established whenever any category revision
occurs? What prevents a combinatorial explosion of contexts? How do phenomena (e.g., P1-P3)
come to be understood in terms of the subsequent category in the new context?

None  of  the  information  revision  schemes  in  Figure  4  are  themselves  complete  theories  of 
conceptual  change.  But  they  suggest  that  information  revision  -  even  at  a  very  abstract  level  - 
has  wide  implications  for  theories  of  conceptual  change,  especially  those  that  make  claims  about 
coherence  and  categorical  shift. 
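
The five schemes can be restated as operations on a mapping from phenomena to the categories used to understand them. The sketch below is only a restatement of Figure 4 at the same abstract level, under assumptions of ours: the function names are hypothetical, `None` stands for a phenomenon that is no longer understood, and the assignment of P1-P3 to contexts in the last function is assumed, since the text leaves it open.

```python
from typing import Dict, Optional, Set, Tuple

PRIOR, SUBSEQUENT = "Force[1]", "Force[2]"
Understanding = Dict[str, Optional[str]]  # phenomenon -> category (None = not understood)

# Figure 4(a): all three phenomena are understood in terms of the prior category.
INITIAL: Understanding = {"P1": PRIOR, "P2": PRIOR, "P3": PRIOR}


def direct_revision_with_loss(state: Understanding, learned: str) -> Understanding:
    """Figure 4(b): the prior category is overwritten; other phenomena lose support."""
    return {p: (SUBSEQUENT if p == learned else None) for p in state}


def direct_revision_with_substitution(state: Understanding) -> Understanding:
    """Figure 4(c): the prior category is overwritten and everything is re-understood."""
    return {p: SUBSEQUENT for p in state}


def copy_revision(state: Understanding, learned: str) -> Understanding:
    """Figure 4(d): both categories coexist; only the learned phenomenon shifts."""
    return {p: (SUBSEQUENT if p == learned else c) for p, c in state.items()}


def copy_revision_with_quarantine(
        state: Understanding, learned: str) -> Tuple[Understanding, Set[str]]:
    """Figure 4(e): as 4(d), but only the subsequent category may explain new phenomena."""
    return copy_revision(state, learned), {SUBSEQUENT}


def copy_revision_with_contexts(
        state: Understanding, learned: str) -> Dict[str, Tuple[str, Optional[str]]]:
    """Figure 4(f): as 4(d), with each category walled off in its own context."""
    context_of = {PRIOR: "Context[1]", SUBSEQUENT: "Context[2]"}
    return {p: (context_of[c], c) for p, c in copy_revision(state, learned).items()}
```

Written this way, schemes (d) through (f) share the same mapping and differ only in the bookkeeping attached to it, which is where the open questions about monitoring, quarantine, and context creation arise.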

For  each  of  the  four  conceptual  change  theories,  we  discuss  their  commitment  to  how 
information  is  organized  and  revised  with  respect  to  a  student  learning  Newtonian  concept 
Force[2] in the presence of Force[1]. Some of the theories do not take a clear stand with respect
to  whether  prior  and  subsequent  concepts  (i.e.,  beliefs,  categories,  and  mental  models)  can  exist 
simultaneously,  so  our  analysis  includes  some  speculation. 

In  Carey’s  (2009)  account  of  Quinian  bootstrapping,  a  student  learning  Newtonian  dynamics 
would  establish  another  conceptual  system  with  a  placeholder  symbol  for  “force.”  This  entails  at 
least  the  following  operations:  (1)  recognize  that  the  present  and  new  concepts  of  force  are 
incoherent  (i.e.,  incommensurable);  (2)  establish  a  new  conceptual  system  CS2  for  everyday 
dynamics;  (3)  create  a  placeholder  symbol  for  the  new  force  concept  in  CS2;  (4)  create 
placeholder  symbols  in  CS2  for  related  concepts  (e.g.,  acceleration  and  mass)  and  relations 
between  them;  and  (5)  enrich  CS2  using  modeling  processes.  These  operations  illustrate  that 
Carey’s  theory  does  not  involve  direct  revision  of  categories.  Rather,  it  involves  a  very  shallow 
copy-revision  (more  of  an  addition)  since  the  subsequent  concept  is  only  a  placeholder  symbol. 
This is most similar to Figure 4(f), where Context[1] represents CS1 and Context[2] represents
CS2, although both contexts are clearly lacking other quantities and placeholder symbols.
Coherence  is  enforced  at  the  granularity  of  conceptual  systems,  since  the  prior  and  subsequent 
concepts  are  stored  in  different  conceptual  systems.  Step  5  describes  how  the  new  conceptual 
system  obtains  content,  but  it  is  not  clear  how  real-world  phenomena  come  to  be  explained  in 
terms of the new conceptual system CS2 with Force[2] rather than the previous system CS1 with
Force[1].

Chi’s  (2008;  Reiner  et  al.,  2000)  account  of  categorical  shift  is  less  straightforward  with 
respect  to  the  retention  of  previous  beliefs  and  categories.  The  conjecture  of  Chi  and  colleagues 
is  that  the  concept  of  force  starts  as  a  subordinate  category  of  “Substances”  for  most  novices,  and 
then  is  shifted  to  become  a  subordinate  of  the  lateral  category  “Constraint-based  interactions” 
under  the  “Processes”  ontological  tree.  Unlike  Carey’s  theory,  Chi’s  theory  does  not  mention  the 
establishment of a new conceptual system that permits Force[1] and Force[2] to coexist.
Ioannides and Vosniadou (2002) note that “Chi and colleagues seem to believe that conceptual
change is a radical process that happens in a short period of time as an individual learns the
correct ontology for a given concept” (p. 7). In defense of Chi and colleagues, Chi (2008) notes
that  conceptual  change  only  happens  quickly  if  the  learner  is  already  familiar  with  the  target 
category  (e.g.,  “Constraint-based  interactions”)  of  the  categorical  shift.  Otherwise,  the  learner 
must  learn  the  properties  of  the  target  category,  e.g.,  via  formal  instruction,  before  they  can 
complete  the  categorical  shift  (Chi,  2008;  2005).  So,  Chi’s  theory  of  conceptual  change  is 
prolonged  over  the  enrichment  of  the  target  category.  After  this  is  achieved,  the  concept 
Force[1] appears to be directly revised/shifted (e.g., as in Figure 4(b-c)), so the prior and
subsequent  concepts  do  not  exist  simultaneously.  Further,  this  suggests  that  conceptual  change 
of the force concept would be trivial (or even instantaneous) if the learner was already familiar
with  “Constraint-based  interactions.” 

According  to  Vosniadou’s  theory,  changing  the  meaning  of  force  is  a  gradual  transition  from 
an  “initial”  meaning  of  force  through  a  series  of  “synthetic”  meanings  of  force  that  incorporate 
aspects  of  the  initial  theory  with  scientific  knowledge  (Ioannides  and  Vosniadou,  2002).  The 
overall  change  from  intuitive  to  scientific  concepts  of  force  is  gradual  due  to  smaller  changes  in 
the  beliefs  and  presuppositions  (described  above)  that  comprise  the  learner’s  framework  theory. 
Some  of  these  changes  in  the  meaning  of  force  occur  spontaneously.  For  example,  a  student 
with  an  internal  meaning  of  force  (i.e.,  force  is  an  internal  property  of  physical  objects  affected 
by  weight  and/or  size)  might  notice  that  objects  appear  to  acquire  forces  which  sustain  their 
movement.  This  is  inconsistent  with  the  idea  that  forces  are  only  internal.  Since  the  learner  is 
committed to coherence, “acquired and internal force cannot coexist” (Ioannides & Vosniadou,
2002, p. 41, their emphasis). Thus, the learner spontaneously shifts to an acquired meaning of force
(i.e.,  objects  acquire  forces  which  cause  movement). 

The  assertion  that  internal  and  acquired  meanings  of  force  cannot  coexist  suggests  that 
Vosniadou’s  theory  involves  directly  revising  the  prior  concept  -  or  at  least  immediately 
eliminating  it.  Thus,  in  Vosniadou’s  theory,  the  prior  and  subsequent  concepts  do  not  exist 
simultaneously.  Had  the  authors  stated  that  these  meanings  of  force  cannot  coexist  in  the  same 
framework,  then  we  would  conclude  that  Vosniadou’s  mechanism  of  change  involves 
quarantined  copy-revision.  Unlike  Chi’s  theory,  Vosniadou’s  theory  segments  the  larger  change 
from initial to Newtonian force into a series of incremental conceptual changes; however, like
Chi’s  theory,  the  individual  changes  are  conducted  by  directly  revising  the  framework  theory  and 
concepts  embedded  therein. 

In the knowledge in pieces literature, diSessa and Sherin (1998) use the term coordination
class  to  describe  a  connected  set  of  strategies  for  gathering  information  and  understanding  the 
world.  In  this  account,  physical  quantities  (e.g.,  force  and  velocity)  are  considered  coordination 
classes  rather  than  categories  (e.g.,  bird  or  hammer).  This  is  because  quantities  often  connect 
preconditions  to  particular  outcomes  in  a  causal  net  which  is  part  of  a  coordination  class. 
diSessa  and  Sherin  use  the  equation  F  =  ma  to  exemplify  a  causal  net13  since  the  existence  of  a 
force  “causes”  acceleration:  we  can  determine  force  by  observing  acceleration  and  we  can 
predict  acceleration  by  knowing  force.  The  authors  perform  an  in-depth  analysis  on  the 
interview  transcript  of  a  student  describing  the  forces  that  exist  when  a  hand  pushes  a  book  along 
the  surface  of  a  table.  The  authors  explain  the  student’s  problem-solving  difficulties  in  terms  of 
competing  causal  nets:  a  Newtonian  F  =  ma  causal  net  applies  to  the  situation  but  makes 
predictions  that  she  believes  are  inconsistent,  so  she  excludes  the  situation  from  F  =  ma  and 
instead  uses  an  intuitive  causal  net.  This  suggests  that  intuitive  and  instructional  conceptual 
structures  -  which  are  mutually  incoherent  -  simultaneously  coexist  and  compete  to  explain 
phenomena. This is a clear example of addition/copy-revision in Figure 4(d), where Force[1]
and  Force[2]  indicate  different  coordination  classes. 

Our  analysis  suggests  that  there  are  disagreements  among  these  theories  on  the  foundational 
issue  of  how  information  is  revised.  Carey’s  theory  and  KiP  both  involve  the  establishment  of 
new  conceptual  structures  that  coexist  with  prior  structures;  however,  the  theories  disagree  on 
how  the  new  and  old  structures  are  contextualized.  Chi’s  and  Vosniadou’s  theories  apparently 
rely  on  the  direct  revision  of  concepts  once  the  appropriate  category  of  a  concept  is  learned 
(according  to  Chi)  or  once  the  presuppositions  and  theories  of  the  framework  permit  it 
(according  to  Vosniadou). 

13  Not  all  causal  nets  are  equations,  since  students  have  many  qualitative  assumptions  about  quantities  and  causality. 

One objection to this analysis is that theories of conceptual change can be
noncommittal  about  how  information  is  revised  -  after  all,  it  is  often  advantageous  to  discuss 
cognition  at  different  levels  of  abstraction  (e.g.,  Marr,  1982).  In  counter-argument,  each  of  these 
theories  of  conceptual  change  makes  a  claim  about  the  presence  or  absence  of  coherence. 
Coherence has implications for the information revision scheme, and vice versa. Consequently,
conceptual  change  theories  should  describe  the  relationship  between  prior  and  subsequent 
knowledge,  including  whether  they  coexist  and  how  they  are  contextualized. 

The  issue  of  whether  new  information  coexists  with  previous,  conflicting  knowledge  -  and 
how  it  does  so  -  has  implications  for  coherence,  the  role  of  context,  the  mechanisms  and 
complexity  of  change,  and  the  process  of  understanding.  I  believe  that  most  of  the 
disagreements  among  conceptual  change  theories  stem  from  vagueness  and  disagreement  on  this 
fundamental  issue. 

2.2.3  The  effect  of  explaining  on  the  process  of  change 

The  research  of  Chi  and  colleagues  (Chi  et  al.,  1994a;  Chi,  2000;  de  Leeuw  &  Chi,  2002)  has 
characterized the self-explanation effect, where explaining new information to oneself helps
repair flawed mental models. Chi et al. (1994a) determined that students who explain to
themselves while reading a textbook passage - even when prompted by an experimenter to do so
- perform better on a posttest than students who simply read the passage twice. Frequent
self-explainers experience the greatest benefit. Chi (2000) describes the mechanism by which
self-explaining promotes mental model transformation: (1) explaining the new knowledge causes
recognition  of  qualitative  conflicts  (i.e.,  different  predictions  and  structure)  between  a  mental 
model  and  the  text  model;  (2)  the  conflict  is  propagated  in  the  mental  model  to  find 
inconsistencies in the consequences; and (3) the mental model is repaired using elementary
addition,  deletion,  concatenation,  or  feature  generalization  operators.  In  short,  self-explanation 
finds  contradictions  within  implicit  conflicts,  thus  causing  belief  revision.  This  can  result  in 
mental  model  transformation  in  Chi’s  framework,  as  described  above. 

Constructing  an  explanation  for  peer  interaction  can  have  the  same  beneficial  effects  on 
learning  as  self-explanation  (Webb,  1989).  Both  explanation  scenarios  require  that  we  make 
sense  of  relevant  information;  however,  explaining  to  somebody  else  requires  that  we  monitor 
the  listener’s  comprehension,  which  might  distract  from  our  learning. 

In  Vosniadou’s  theory  of  conceptual  change,  “specific  explanations”  (synonymous  with 
“specific  theory;”  Ioannides  and  Vosniadou,  2002)  are  embedded  within  a  larger  framework 
theory.  It  is  not  clear  whether  “specific  explanation”  refers  to  Chi’s  notion  of  explanation,  but  it 
appears  that  explanations  -  or  the  specific  theoretical  components  thereof  -  are  persistent 
structures  (unlike  Chi’s  theory).  As  in  Chi’s  theory,  constructing  a  new  explanation  can  revise  or 
replace these structures within the larger framework. Since we have too little information on
how  explanation  affects  conceptual  change  in  Vosniadou’s  theory,  we  do  not  speculate  any 
further. 

At  the  heart  of  Carey’s  (2009)  account  of  Quinian  bootstrapping  are  modeling  processes  that 
provide  meaning  for  placeholder  structures  in  a  new  conceptual  system.  These  modeling 
processes  include  analogy,  induction,  thought  experiments,  limiting  case  analyses,  and  abduction 
(i.e.,  reasoning  to  the  best  explanation).  Both  analogy  and  abduction  are  relevant  mechanisms  of 
explanation  for  our  discussion. 14  These  explanation  processes  generate  the  actual  content  of  a 


14 Chi et al. (1994a) use the spontaneous analogy “the septum [of the heart] is like a wall” as an example of a
self-explanation (pp. 454-455), so we include analogy in our discussion of the effect of explanation.
new conceptual system by (1) importing knowledge from other domains via analogy, and (2)
making  coherent  assumptions  via  abduction. 

Chi  and  Carey  are  assuming  the  same  explanatory  mechanisms  (i.e.,  model-based  abduction 
and  analogy)  but  in  reference  to  different  types  of  change.  Chi  discusses  how  explanation 
promotes  mental  model  transformation  by  repairing  conflicts,  and  Carey  discusses  how  it 
enriches  a  new  conceptual  system  for  incommensurable  conceptual  change.  We  believe  that 
constructing  explanations  can  play  both  of  these  roles,  and  our  computational  model  constructs 
explanations  to  achieve  both  of  these  types  of  conceptual  change  (i.e.,  mental  model 
transformation  and  category  revision).  Our  computational  model  does  not  simulate  all  of  the 
modeling  processes  mentioned  by  Carey  (2009),  nor  does  it  model  Quinian  bootstrapping  in  its 
entirety. 

From  the  KiP  perspective,  constructing  an  explanation  involves  combining  and  jointly  using 
multiple  pieces  of  knowledge.  diSessa  (1993)  notes  that  using  multiple  p-prims  in  dynamic 
sequence or standard clusters causes these p-prims to raise or lower their structured
priority  simultaneously,  where  structured  priority  refers  to  (1)  the  strength  of  the  connections 
between  a  p-prim  and  previously  activated  elements  and  (2)  its  likelihood  of  remaining  activated 
during  subsequent  processing.  This  indicates  that  explaining  shifts  the  context  of  conceptual 
structures.  This,  too,  is  a  role  of  explanation  in  our  computational  model. 

We see no explicit disagreement regarding the role of explanation in conceptual change.
Each theory describes a separate effect of explaining, but these effects are mutually consistent.

2.2.4 The source of coherence

There  is  wide  consensus  that  coherence  is  a  desirable  property  of  explanations  (Thagard,  2007; 
Lombrozo,  2011),  and  that  people  revise  their  explanations  to  cohere  with  credible  knowledge 
(Sherin et al., 2012). There is less agreement, however, on the source of coherence, and even on
the definition of coherence (diSessa et al., 2004; Ioannides and Vosniadou, 2002; Thagard,
2000). Because the definition of coherence is more subjective, we discuss the dispute over the
more  general  -  and  less  ambiguous  -  epistemic  property  of  logical  consistency.  In  short,  if  a  set 
of  beliefs  and  mental  models  do  not  directly  entail  a  contradiction,  they  are  logically  consistent. 15 
Logical  consistency  is  necessary  but  not  sufficient  for  coherence.  We  do  not  assume  that  all 
possible  contradictions  are  immediately  detected  by  the  learner,  so  for  our  discussion, 
“consistency”  refers  to  perceived  consistency  rather  than  objective  logical  consistency.  We 
discuss  the  disagreement  among  conceptual  change  theories  about  the  role  and  source  of 
consistency,  which  helps  illustrate  the  more  complicated  dispute  about  coherence. 

To  begin,  we  must  define  coherence  and  consistency  as  a  quantified  property.  A  set  of 
beliefs  and  mental  models  can  be  internally  consistent  if  they  do  not  entail  a  contradiction, 
regardless of beliefs and mental models outside of the set. Beliefs are globally consistent if the
superset  of  all  beliefs  and  models  of  the  learner  do  not  entail  a  contradiction.  Internal  and  global 
coherence  can  be  bounded  in  a  similar  fashion,  but  coherence  is  stricter  than  logical  consistency. 
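
The distinction between internal and global consistency can be illustrated with a minimal sketch. It assumes that beliefs are plain strings, that a contradiction is only the direct co-occurrence of a belief and its negation (no deductive closure, in keeping with the discussion above), and that beliefs are grouped into named sets. The function names and the two example belief sets are hypothetical and are not part of our model.

```python
from itertools import chain


def negate(belief):
    """Return the direct negation of a belief written as a string."""
    return belief[4:] if belief.startswith("not ") else "not " + belief


def is_consistent(beliefs):
    """True if no belief appears alongside its direct negation."""
    beliefs = set(beliefs)
    return not any(negate(b) in beliefs for b in beliefs)


def internally_consistent(contexts):
    """Check each set on its own, ignoring beliefs outside of it."""
    return {name: is_consistent(bs) for name, bs in contexts.items()}


def globally_consistent(contexts):
    """Check the union of all beliefs held by the learner."""
    return is_consistent(chain.from_iterable(contexts.values()))


# Hypothetical belief sets, invented for illustration only.
contexts = {
    "intuitive": {"motion requires force", "force is a property of objects"},
    "newtonian": {"not motion requires force", "force is an interaction"},
}

print(internally_consistent(contexts))
print(globally_consistent(contexts))
```

In this sketch each set passes the internal check, while the union fails the global check because one belief and its negation are held in different sets.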

Carey’s  theory  assumes  coherence  -  and  therefore  logical  consistency  -  within  conceptual 
systems. When a learner utilizes a coherent, intuitive conceptual system CS1 and encounters an
instructional concept that is incommensurable with CS1, he or she establishes a new, coherent

15  We  do  not  assume  that  the  set  of  beliefs  and  models  is  deductively  closed,  since  this  is  not  presumed  of  any  of  the 
theories  of  conceptual  change.  Consequently,  we  are  referring  to  contradictions  that  are  entailed  directly  from  this 
knowledge. 
conceptual system CS2. While the learner acquires content and relation structure for CS2, the
knowledge in CS1 is still available. Conceptual systems CS1 and CS2 are internally consistent,
but CS1 and CS2 may be mutually inconsistent, so the learner’s knowledge may be globally
inconsistent.  For  Carey,  the  granularity  of  consistency  is  at  the  level  of  conceptual  systems,  and 
it  appears  to  be  a  hard  constraint.  Interestingly,  a  learner’s  knowledge  must  be  globally 
incoherent  in  Carey’s  theory,  since  incoherence  is  a  necessary  property  of  incommensurability, 
and  incommensurability  is  a  precursor  for  establishing  the  new  conceptual  system  CS2. 
Consequently,  Carey  assumes  internal  coherence  of  conceptual  systems  and  global  incoherence 
among  the  union  of  all  conceptual  systems. 

In  Chi’s  theory,  beliefs  and  mental  models  are  revised  when  logical  inconsistencies  are 
detected.  This  is  triggered  via  belief-level  refutation  or  via  self-explanation,  which  propagates 
implicit  conflicts  into  explicit  contradictions  (Chi,  2008).  In  Chi’s  theory,  consistency  does  not 
appear  to  be  a  hard  constraint  on  conceptual  systems,  but  the  lack  of  consistency  in  a  conceptual 
system  drives  the  revision  of  components.  Consistency  therefore  is  a  soft  constraint  (i.e.,  it  is 
desired  but  not  required). 

In  Vosniadou’s  theory,  two  inconsistent  concepts  (e.g.,  meanings  of  force)  cannot  coexist 
within  the  same  framework  theory  (Ioannides  and  Vosniadou,  2002).  When  an  inconsistency  is 
detected  within  a  framework  theory,  it  is  immediately  remedied.  This  is  because  mental  models 
are  “dynamic,  situated,  and  constantly  changing  representations  that  adapt  to  contextual 
variables” (Vosniadou, 2007, p. 11). Unlike Carey’s theory, Vosniadou’s theory does not
mention  the  establishment  of  a  new  context  to  store  the  inconsistent  concept,  so  it  is  not  clear 
whether  the  old  concept  exists.  Since  framework  theories  are  internally  consistent  and 
inconsistent  concepts  are  removed  from  them,  Vosniadou’s  theory  appears  to  assume  global 
consistency in a student’s knowledge. However, Vosniadou (2007) further argues that students
lack  metaconceptual  awareness  of  their  beliefs,  and  that  promoting  this  awareness  is  an  integral 
part  of  teaching  for  conceptual  change.  This  suggests  that  inconsistency  and  incoherence  may 
frequently  go  undetected  by  novice  students,  which  weakens  this  global  coherence  constraint 
considerably. 

Knowledge  in  pieces  involves  the  coexistence  of  new  and  old  conceptual  structures  that  are 
globally  incoherent  and  that  make  globally  inconsistent  predictions.  Coherence  and  consistency 
are  therefore  not  properties  of  the  knowledge  system,  but  they  are  generally  properties  of  the 
explanations  that  are  constructed  from  it.  When  individual  knowledge  elements  (e.g.,  p-prims) 
are  combined  to  form  a  coherent  explanation,  their  structured  priorities  are  modified  (diSessa, 
1993).  As  a  result,  knowledge  elements  that  are  coordinated  coherently  (and  therefore, 
consistently)  are  more  likely  to  be  activated  together  in  the  future.  Coherence  and  consistency 
spread  as  new  combinations  of  knowledge  are  considered  and  as  knowledge  elements  are  used  in 
new  contexts.  Since  the  explanation  process  has  a  bias  toward  coherence,  coherence  emerges 
from  this  process  rather  than  from  the  knowledge  system  directly. 

In  summary,  there  are  direct  disagreements  about  the  source  of  consistency  and  coherence  in 
explanations  and  knowledge  systems.  From  the  KiP  perspective,  the  knowledge  system  is 
incoherent,  and  coherence  is  a  product  of  coordinating  knowledge  into  explanations  based  on 
dynamic  activation  priorities.  In  contrast,  the  other  three  theories  rely  on  one  or  more  generally 
coherent  conceptual  systems  prior  to  explanation  construction.  According  to  Chi  and  Vosniadou, 
incoherence  is  a  cue  to  modify  a  conceptual  system  by  revising  beliefs  and  mental  models  and 
the  categories  used  to  represent  them.  Carey  agrees  that  incoherence  can  lead  to  belief  revision 
and  enrichment  within  a  single  conceptual  system,  but  disagrees  that  it  causes  incommensurable 
changes such as categorical shift within a single conceptual system. For Carey, when
inconsistency  is  accompanied  by  incommensurability  during  formal  education,  it  is  a  cue  for 
establishing  a  new  conceptual  system  altogether,  which  is  internally  coherent  and  consistent. 

2.3  The  path  forward 

Our  computational  model  of  conceptual  change  can  shed  light  on  the  areas  of  disagreement  and 
divergence discussed in this chapter: how information is revised, the role of explanation, and the
source of coherence. Our computational model is not an implementation of any of these four
theories; the psychological assumptions of our model conflict in some ways with each of the
theories  described  above.  Further,  our  model  of  conceptual  change  is  not  complete  with  respect 
to  any  of  these  theories  -  there  are  many  things  it  does  not  model,  including  the  following:  (1) 
the  development  of  metacognitive  awareness  of  one’s  beliefs  (Vosniadou,  2007);  (2)  the  full 
spectrum  of  model-based  processes  that  enrich  a  new  conceptual  system  (Carey,  2009);  and  (3) 
spontaneous analogies for self-explanation (Chi et al., 1994a). We therefore cannot expect this - or
any  -  single  cognitive  model  to  reconcile  all  four  theories  outlined  in  this  chapter.  More 
reasonable  goals  for  our  computational  model  include  the  following:  (1)  develop  a  system  for 
representing  and  contextualizing  conceptual  knowledge;  (2)  integrate  the  roles  of  explanation  in 
each  conceptual  change  theory  into  a  single  framework;  and  (3)  demonstrate  that  a  knowledge 
system  can  indeed  be  globally  incoherent  yet  still  produce  coherent  explanations. 



Chapter  3:  Background 

Our  computational  model  of  conceptual  change  draws  upon  a  number  of  areas  of  AI.  For 
instance,  qualitative  modeling  -  a  research  area  initially  motivated  by  the  study  of  human  mental 
models  -  provides  us  with  a  composable  knowledge  representation  and  a  vocabulary  for 
descriptive  and  mechanism-based  models.  Computational  cognitive  models  of  analogical 
mapping,  reminding,  and  generalization  can  be  used  for  comparison,  retrieval,  and  induction, 
respectively.  We  can  also  use  existing  AI  technology  for  logically  contextualizing  information 
and  for  tracking  the  rationale  of  beliefs  and  their  underlying  assumptions.  Finally,  we  can  use 
existing  tools  to  automatically  encode  sketches  into  relational  knowledge,  to  rapidly  and  reliably 
create  data  for  learning  and  testing  in  modalities  familiar  to  people. 

3.1  Ontologies 

An  ontology  represents  a  set  of  categories  (also  called  collections )  and  the  relationships  between 
them.  Each  category  represents  some  type  of  object/substance  (e.g.,  Dog,  ContainedFluid, 
HeartValve)  or  event/situation  (e.g.,  FluidFlow,  PhysicalTransfer,  BuyingADrink). 
These  collections  are  part  of  the  vocabulary  with  which  beliefs  are  represented.  For  instance,  we 
can assert the statement (isa entity2034 Dog) to say that the symbol entity2034 is an
instance of the collection Dog, or more casually, that entity2034 is a Dog. Ontologies contain
relationships between collections. For example, the statement (genls Dog CanineAnimal)
states  that  all  instances  of  the  subordinate  collection  Dog  are  also  instances  of  superordinate 
category  CanineAnimal,  but  not  necessarily  the  other  way  around.  This  makes  ontologies 
hierarchical.  Figure  5  illustrates  a  small  portion  of  the  OpenCyc  ontology  which  includes  Dog 
and CanineAnimal collections. We use the OpenCyc ontology for our cognitive model, but we
only use very small portions of it. The OpenCyc ontology was not constructed with the intent of
modeling novice learners - quite the opposite, in fact - so we make heavy use of the isa and
genls  relations  but  only  minimal  use  of  the  abstract  content. 


[Figure 5 diagram. Collections shown include NonPersonAnimal, TameAnimal, TerrestrialOrganism, Carnivore, Beagle, Dalmatian, and Whippet.]

Figure  5:  A  small  portion  of  the  OpenCyc  ontology.  An  arrow  a->b  indicates  (genls  a  b). 

On a related note, Chi’s (2008) theory of conceptual change, outlined in Chapter 2, assumes
the existence of “ontological trees.” These share the hierarchical property of the ontologies
described here; however, it is not clear that categories in Chi’s ontological trees can inherit from
multiple superordinate categories as illustrated in Figure 5.
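
The isa and genls relations described in this section can be sketched in a few lines. This is an illustration only, not the OpenCyc implementation: the Ontology class and its method names are hypothetical, and apart from (genls Dog CanineAnimal) and (isa entity2034 Dog), which appear in the text, the asserted facts are assumptions chosen to show a collection with more than one superordinate.

```python
class Ontology:
    """Toy store for isa and genls statements (hypothetical, for illustration)."""

    def __init__(self):
        self.genls = {}  # collection -> set of direct superordinate collections
        self.isa = {}    # instance -> set of directly asserted collections

    def add_genls(self, sub, sup):
        self.genls.setdefault(sub, set()).add(sup)

    def add_isa(self, instance, collection):
        self.isa.setdefault(instance, set()).add(collection)

    def all_genls(self, collection):
        """The collection itself plus every superordinate reachable via genls."""
        seen, stack = set(), [collection]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.genls.get(current, ()))
        return seen

    def is_instance(self, instance, collection):
        """True if the instance belongs to the collection directly or by inheritance."""
        return any(collection in self.all_genls(c)
                   for c in self.isa.get(instance, ()))


onto = Ontology()
onto.add_genls("Dog", "CanineAnimal")  # stated in the text
onto.add_genls("Dog", "TameAnimal")    # assumed, to show multiple superordinates
onto.add_genls("Beagle", "Dog")        # assumed
onto.add_isa("entity2034", "Dog")      # stated in the text

print(onto.is_instance("entity2034", "CanineAnimal"))
print(onto.is_instance("entity2034", "Beagle"))
```

Membership is inherited upward only: an instance of Dog is an instance of every superordinate of Dog, but not of its subordinates.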


3.2  Qualitative  reasoning 

In  the  introduction,  we  mentioned  popular  examples  of  conceptual  change,  including  the 
changing  concepts  of  force,  heat,  and  temperature.  Changes  in  other  concepts  such  as  speed, 
velocity,  momentum,  acceleration,  mass,  weight,  light,  and  electricity  have  also  been 
characterized in the literature (Reif, 1985; Dykstra et al., 1992; Reiner et al., 2000).
Interestingly,  all  of  these  concepts  are  represented  as  quantities  at  some  point  in  the  trajectory  of 
misconceptions,  and  most  of  them  are  represented  as  quantities  throughout.  Consequently, 
modeling  conceptual  change  involves  representing  and  reasoning  with  quantities  and  also 
revising  the  existential  and  behavioral  properties  of  quantities. 

“Quantity”  is  not  synonymous  with  “number.”  A  quantity  (e.g.,  the  volume  of  lemonade  in 
a pitcher) may be assigned a numerical, unit-specific value (e.g., 12 fluid ounces) at a specific
time.  But  people  can  reason  very  effectively  without  numbers.  For  instance,  we  might  know 
that  the  volume  of  lemonade  in  a  pitcher  is  greater  than  zero  ounces  and  less  than  the  volume  of 
the  pitcher  (e.g.,  64  ounces).  If  the  height  of  the  lemonade  is  millimeters  below  the  rim  of  the 
pitcher,  we  might  estimate  that  the  volume  is  roughly  six-glasses-worth,  or  just  use  a  qualitative 
label  such  as  a  lot  to  represent  the  volume,  based  on  how  the  estimate  anchors  within  our  space 
of  experiences  (Paritosh,  2004).  Without  numerical  knowledge,  we  can  also  reason  about 
causality.  For  example  (quantities  in  italics),  we  know  that  if  we  increase  the  angle  of  the 
pitcher,  the  height  of  the  pitcher  lip  will  decrease.  Once  it  decreases  below  the  height  of  the 
lemonade,  a  fluid  flow  will  start,  and  as  we  continue  to  increase  the  angle  of  the  pitcher,  we  will 
also  increase  the  rate  of  flow.  In  this  example,  we  used  the  words  “increase”  and  “decrease”  to 
refer  to  the  direction  of  change  of  a  quantity’s  value,  and  we  used  ordinal  relationships  such  as 
“below,”  to  refer  to  inequalities  between  the  values  of  two  quantities.  In  this  manner,  people  can 
reason  qualitatively  about  continuous  quantities,  rates  and  directionalities  of  change,  and  ordinal 
relationships  (i.e.,  greater  than,  less  than,  equal  to)  between  them.  A  large  literature  describes 
formal  approaches  for  representing  and  reasoning  about  processes  (e.g.,  Forbus,  1984)  and 
devices  (e.g.,  de  Kleer  &  Brown,  1984),  and  simulating  systems  provided  this  knowledge  (e.g., 
Kuipers,  1986). 
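
The pitcher example above can be sketched without any numerical values. The sketch below assumes that a direction of change is encoded as a sign, that an ordinal relation is one of the symbols '<', '=', or '>', and that only the lip height changes while the lemonade height stays fixed. All names are hypothetical, and this is not one of the formal approaches cited above.

```python
# Directions of change are signs, not magnitudes.
INCREASING, STEADY, DECREASING = 1, 0, -1

# Ordinal relations between lip height and lemonade height, lowest first.
ORDER = ["<", "=", ">"]


def lip_direction(angle_direction):
    """Tilting the pitcher further lowers its lip, and vice versa."""
    return -angle_direction


def next_relation(relation, lip_dir):
    """Relation that holds after the lip height moves one step in lip_dir.

    Assumes the lemonade height itself is not changing.
    """
    index = ORDER.index(relation) + lip_dir
    return ORDER[max(0, min(index, len(ORDER) - 1))]


def flow_active(relation):
    """A flow exists once the lip height is below the lemonade height."""
    return relation == "<"


def flow_rate_direction(angle_direction, relation):
    """While a flow exists, its rate changes in the same direction as the angle."""
    return angle_direction if flow_active(relation) else STEADY


# Start with the lip above the lemonade and keep increasing the angle.
relation = ">"
for _ in range(3):
    relation = next_relation(relation, lip_direction(INCREASING))
    print(relation, flow_active(relation),
          flow_rate_direction(INCREASING, relation))
```

The loop walks the ordinal relation from above, to level, to below, at which point the flow becomes active and its rate starts to increase with the angle.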

Novices and experts alike often reason with incomplete and imprecise qualitative
knowledge, especially in situations of informational uncertainty (Trickett & Trafton, 2007).
Consider the following incorrect near-far novice explanation of how the seasons change (Sherin
et al., 2012): the earth orbits the sun along an elliptical path and is closer to the sun in the
summer  than  in  the  winter.  This  mental  model  includes  no  numbers,  but  mentions  quantities 
(e.g.,  the  distance  between  the  earth  and  the  sun,  the  temperature  of  the  earth)  and  relations 
between  quantities  (e.g.,  the  earth’s  temperature  strictly  increases  as  its  distance  to  the  sun 
decreases).  This  is  textbook  qualitative  reasoning.  We  next  review  relevant  AI  methods  for 
representing,  constructing,  and  reasoning  with  qualitative  models. 

3.2.1  Qualitative  Process  Theory 

Qualitative  process  (QP)  theory  (Forbus,  1984)  provides  a  vocabulary  for  representing 
mechanisms of change. Under QP theory, only processes cause changes in a physical system.
For our example of pouring lemonade in the previous section, model fragments can represent the
contained  fluids  and  the  flow  of  fluid. 

QP  theory  also  includes  causal  relationships  between  quantities.  Direct  influences  are 
relationships  between  quantities  where  a  quantity  (e.g.,  the  rate  of  flow)  increases  or  decreases 
another  (e.g.,  the  volume  of  the  fluid  in  the  source).  Direct  influences  often  exist  between  the 
rate  of  a  process  and  an  affected  quantity,  and  are  represented  by  i+  and  i-  relations  (e.g., 
consequences  of  FluidFlow  in  Figure  6),  which  describe  positive  and  negative  direct 
influences,  respectively.  Indirect  influences  describe  causal  relationships  between  quantities 
where  a  quantity  (e.g.,  the  volume  of  a  container)  causes  a  positive  or  negative  change  in  another 
quantity  (e.g.,  the  pressure  of  the  fluid  therein)  under  a  closed-world  assumption.  Indirect 
influences are represented by qprop and qprop- relations (e.g., consequences of
ContainedFluid  in  Figure  6),  which  describe  positive  and  negative  indirect  influences, 
respectively.  Qualitative  proportionalities  represent  causal  influences  between  quantities  where 
the  direction  of  change  is  strictly  increasing  or  decreasing. 
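
The way signed influences combine can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function name resolve_influences, the string names for quantities, and the "steady"/"ambiguous" labels are assumptions made for illustration. It assumes, as in QP theory, that a direct influence contributes the sign of the rate's value, that a qualitative proportionality contributes the sign of the influencing quantity's change, and that the listed influences are the only ones (the closed-world assumption).

```python
from typing import Dict, List, Tuple

# (kind, constrained quantity, influencing quantity); kind is "i+", "i-", "qprop", or "qprop-".
Influence = Tuple[str, str, str]


def resolve_influences(influences: List[Influence], signs: Dict[str, int]) -> Dict[str, str]:
    """Combine the signed influences on each constrained quantity.

    signs maps each influencing quantity to -1, 0, or +1: the sign of a rate's
    value for direct influences (i+/i-), and the sign of the quantity's change
    for indirect influences (qprop/qprop-).
    """
    contributions: Dict[str, set] = {}
    for kind, constrained, influencer in influences:
        polarity = -1 if kind.endswith("-") else 1
        contributions.setdefault(constrained, set()).add(polarity * signs[influencer])

    labels = {1: "increasing", -1: "decreasing"}
    result = {}
    for quantity, contribution in contributions.items():
        nonzero = contribution - {0}
        if not nonzero:
            result[quantity] = "steady"
        elif len(nonzero) == 1:
            result[quantity] = labels[next(iter(nonzero))]
        else:
            result[quantity] = "ambiguous"  # opposing influences; signs alone cannot decide
    return result


# Hypothetical quantities named after the fluid-flow example in the text.
influences = [
    ("i-", "Volume(source)", "Rate(flow)"),
    ("i+", "Volume(sink)", "Rate(flow)"),
    ("qprop-", "Pressure(fluid)", "Volume(container)"),
]
signs = {"Rate(flow)": 1, "Volume(container)": -1}
trends = resolve_influences(influences, signs)
```

With a positive flow rate and a shrinking container, the sketch labels the source volume as decreasing, the sink volume as increasing, and the fluid pressure as increasing.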


ModelFragment ContainedFluid
  Participants:
    ?con  Container  (containerOf)
    ?sub  StuffType  (substanceOf)
  Constraints:
    (physicallyContains ?con ?sub)
  Conditions:
    (greaterThan (Amount ?sub ?con) Zero)
  Consequences:
    (qprop- (Pressure ?self) (Volume ?con))

When a container con physically contains a type of substance sub, a contained fluid
exists. When there is a positive amount of sub in con, the volume of con negatively
influences the pressure of this contained fluid.

ModelFragment FluidFlow
  Participants:
    ?source-con  Container       (outOf-Container)
    ?sink-con    Container       (into-Container)
    ?source      ContainedFluid  (fromLocation)
    ?sink        ContainedFluid  (toLocation)
    ?path        Path-Generic    (along-Path)
    ?sub         StuffType       (substanceOf)
  Constraints:
    (substanceOf ?source ?sub)
    (substanceOf ?sink ?sub)
    (containerOf ?source ?source-con)
    (containerOf ?sink ?sink-con)
    (permitsFlow ?path ?sub ?source-con ?sink-con)
  Conditions:
    (unobstructedPath ?path)
    (greaterThan (Pressure ?source) (Pressure ?sink))
  Consequences:
    (greaterThan (Rate ?self) Zero)
    (i- (Volume ?source) (Rate ?self))
    (i+ (Volume ?sink) (Rate ?self))

When two contained fluids - a source and a sink - are connected by a path, and both
are of the same type of substance, a fluid flow exists. When the path is unobstructed
and the pressure of source is greater than the pressure of sink, the rate of the flow is
positive and it decreases the volume of source and increases the volume of sink.


Figure 6: ContainedFluid (above) and FluidFlow (below) model fragments used in the simulation
in Chapter 7. An English interpretation follows each model fragment.


3.2.2  Compositional  modeling 


In  compositional  modeling  (Falkenhainer  &  Forbus,  1991),  domain  knowledge  is  represented 
using  model  fragments,  which  are  combinable  pieces  of  domain  knowledge.  Modeling  the  flow 
of blood in the circulatory system (see Chapter 7 for detail) involves a number of model
fragments,  two  of  which  are  shown  in  Figure  6:  the  conceptual  model  fragment 
ContainedFluid,  and  the  process  model  fragment  FluidFlow.  Model  fragments  are 
instantiated during reasoning. For example, we might infer ContainedFluid model fragment
instances when reasoning about the human circulatory system, since each of the chambers of the
heart contains blood. Each model fragment m can be uniquely defined by a tuple (P, C, A, N, S),
which includes participants, constraints, assumptions, conditions, and consequences, respectively.
We  describe  these  using  the  model  fragments  in  Figure  6  as  an  example. 

Participant statements (P) are statements describing the entities involved in the
phenomenon. Each participant statement, such as (isa ?con Container), states that a participant
slot (e.g., ?con) must be of a specific type (e.g., Container). For example, for the entity heart
to bind to the ?con participant of ContainedFluid, the statement (isa heart Container) must be
true. Participant slot ?con also has the relational role containerOf, so (containerOf cf heart)
would be true of any ContainedFluid instance cf where heart is bound to ?con.

Constraints  (C)  are  statements  that  must  hold  over  the  participants  in  order  for  an  instance  of 
the model fragment to exist. When the constraints hold, an instance instance(m, B) of model
fragment m is inferred as a distinct entity over the participants, where B is the set of
participant bindings. For example, if (physicallyContains heart Blood) is true of Container
instance heart and StuffType instance Blood, then a new model fragment instance will be created
with participant bindings B = {(?con, heart), (?sub, Blood)}. Logically, model fragment
instantiation can be expressed as the following first-order logical equivalence, where a
conjunction of two sets of statements is the conjunction of the union of member statements:

$$P \wedge C \equiv \mathrm{instance}(m, B).$$

Modeling  assumptions  (A)  are  statements  concerning  the  model  fragment’s  relevance  to  the 
task  at  hand.  These  make  the  granularity,  perspectives,  and  approximations  of  the  model 
fragment  explicit.  These  help  select  the  appropriate  method  of  description  for  problem  solving, 
since  the  behavior  of  a  single  physical  phenomenon  (e.g.,  blood  flow  through  arteries)  can  be 
described  at  multiple  granularities  (e.g.,  describing  fluid  volumes  or  describing  localized 
collections  of  matter  being  transported  through  the  body).  Our  computational  model  does  not 
use  modeling  assumptions  to  simulate  students,  but  we  do  believe  that  students  are  capable  of 
reasoning  at  different  levels  of  description,  and  that  learning  the  appropriate  level  of  description 
for  problem-solving  is  important  for  achieving  expert  understanding.  This  is  future  work. 

Conditions  (N)  are  propositions  that  must  hold  over  a  model  fragment’s  participants  that 
limit  the  model  fragment’s  behavioral  scope,  such  as  (greaterThan  (Amount  ?sub  ?con) 
Zero)  in  ContainedFluid.  Conditions  differ  semantically  from  constraints,  since  an  instance 
of  a  model  fragment  can  exist  without  a  condition  satisfied.  When  all  conditions  of  a  model 
fragment  instance  hold,  the  instance  is  active.  More  formally: 


$$\mathrm{instance}(m, B) \wedge A \wedge N \equiv \mathrm{active}(\mathrm{instance}(m, B)).$$

Consequences (S) are propositions that describe a model fragment instance’s constraints on
a  system’s  behavior  when  it  is  active.  For  example,  the  unground  consequence 

(qprop-  (Pressure  ?self)  (Volume  ?con) ) 

of  ContainedFluid  is  inferred  as 

(qprop-  (Pressure  ch)  (Volume  heart)) 

when  an  instance  ch  is  active  with  participant  bindings  B  =  {(?con,  heart),  (?sub,  Blood)}. 
This  imposes  the  constraint  that  the  pressure  of  the  contained  fluid  ch  increases  as  the  volume  of 
heart  decreases.  Model  fragment  activation  can  be  expressed  as  the  following  logical 
implication: 

$$\mathrm{active}(\mathrm{instance}(m, B)) \rightarrow S.$$

Inference with model fragments can therefore be summarized with the implication

$$P \wedge C \wedge A \wedge N \rightarrow S.$$

Model  fragments  are  instantiated  and  activated  within  a  scenario,  which  is  a  logical  context 
that  contains  a  partial  description  of  the  phenomena  to  be  modeled,  such  as  the  propositional 
facts  and  rules  about  the  solar  system  for  using  the  model  fragments  in  Figure  6.  Model 
fragments are stored within a domain theory, which is a set of model fragments and
scenario-independent beliefs. The result of model formulation is a scenario model composed of
one or
more  model  fragment  instances.  Importantly,  one  model  fragment  instance  may  serve  as  a 
participant  of  another  (e.g.,  FluidFlow  in  Figure  6  has  two  ContainedFluid  participants: 
?source  and  ?sink),  so  the  resulting  scenario  model  may  have  a  nested  structure. 
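
The instantiation and activation rules above can be sketched in a few lines of Python. This is an illustrative sketch rather than the system's implementation: the ModelFragment class, the helper names, the tuple encoding of statements, and the brute-force search over entity combinations are all assumptions, and modeling assumptions (A) and relational roles are left out for brevity.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ModelFragment:
    name: str
    participants: tuple   # ((slot, type), ...), i.e. P
    constraints: tuple    # C
    conditions: tuple     # N
    consequences: tuple   # S


def substitute(term, bindings):
    """Replace participant slots in a (possibly nested) statement."""
    if isinstance(term, tuple):
        return tuple(substitute(t, bindings) for t in term)
    return bindings.get(term, term)


def instantiate(mf, facts, entities):
    """Return every binding B for which P and C hold in the scenario facts."""
    slots = [slot for slot, _ in mf.participants]
    found = []
    for combo in product(entities, repeat=len(slots)):
        b = dict(zip(slots, combo))
        p_holds = all(("isa", b[slot], typ) in facts for slot, typ in mf.participants)
        c_holds = all(substitute(c, b) in facts for c in mf.constraints)
        if p_holds and c_holds:
            found.append(b)
    return found


def active_consequences(mf, bindings, facts, self_name):
    """Return the ground consequences S if all conditions N hold, else []."""
    b = {**bindings, "?self": self_name}
    if all(substitute(n, b) in facts for n in mf.conditions):
        return [substitute(s, b) for s in mf.consequences]
    return []


contained_fluid = ModelFragment(
    name="ContainedFluid",
    participants=(("?con", "Container"), ("?sub", "StuffType")),
    constraints=(("physicallyContains", "?con", "?sub"),),
    conditions=(("greaterThan", ("Amount", "?sub", "?con"), "Zero"),),
    consequences=(("qprop-", ("Pressure", "?self"), ("Volume", "?con")),),
)
facts = {
    ("isa", "heart", "Container"),
    ("isa", "Blood", "StuffType"),
    ("physicallyContains", "heart", "Blood"),
    ("greaterThan", ("Amount", "Blood", "heart"), "Zero"),
}
bindings = instantiate(contained_fluid, facts, ["heart", "Blood"])
inferred = active_consequences(contained_fluid, bindings[0], facts, "ch")
```

Here the only binding found is {(?con, heart), (?sub, Blood)}, and because the condition holds, the instance named ch yields the ground consequence (qprop- (Pressure ch) (Volume heart)).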

Provided  compositional  models  and  qualitative  process  theory,  what  constitutes  a  “concept” 
in  our  model  of  conceptual  change?  Put  simply,  a  concept  is  domain  knowledge  that  can  be 
learned  and  revised.  We  define  the  three  following  types  of  knowledge  as  concepts: 

•  Model  fragments:  The  model  fragments  in  Figure  6  and  others  (e.g.,  interaction  of 
forces,  floating,  sinking,  fluid  flow,  and  heat  flow)  represent  concepts  because  they  are 
learnable (see Chapter 5) and revisable (see Chapter 8). As mentioned in Chapter 1,
model  fragments  represent  parts  of  human  mental  models. 

•  Categories  and  quantities:  Chapter  8  describes  how  the  quantities  within  compositional 
model  fragments  can  be  ontologically  revised  using  heuristics,  so  quantities  such  as 
force,  heat,  and  sunlight  are  also  concepts. 

•  Propositional  beliefs:  Domain-level  propositional  beliefs  about  the  world  are  concepts, 
according to the common phrase “the concept that p,” where p is a proposition such as
“the  earth  orbits  the  sun.”  The  truth  value  of  these  propositions  can  change  in  our 
model.  We  do  not  consider  metaknowledge  propositions  (e.g.,  the  proposition  that  I 
learned  about  the  aorta  from  a  textbook)  to  be  concepts. 

The  term  “concept”  has  obvious  problems  due  to  its  ambiguity,  so  we  refer  to  the  specific 
components  -  model  fragments,  quantities,  and  propositional  beliefs  -  when  possible,  and 
compactly use the term “conceptual knowledge” or “concept” to refer to all three types of
knowledge  at  once. 

We  must  also  define  the  term  “misconception”  in  the  context  of  our  model.  In  the  literature, 
misconceptions  are  often  stated  as  general  propositions  such  as,  “continuing  motion  implies  a 
continued  force  in  the  direction  of  the  movement”  (Chi,  2005).  In  our  model,  misconceptions  are 
mistakes  produced  by  a  theory  comprised  of  model  fragments,  beliefs,  and  quantities.  For 
example,  in  Simulation  1,  the  qualitative  models  learned  by  the  system  produce  the 
misconceptions  that  (1)  surfaces  do  not  push  up  against  objects  resting  on  their  surface  and  (2) 
objects  pushed  in  a  given  direction  always  go  in  that  direction,  irrespective  of  prior  velocity. 
These  misconceptions  are  exhibited  on  specific  scenarios,  but  we  can  conclude  that  the  system 
would  perform  similarly  on  analogous  scenarios  due  to  the  principles  of  model-based  inference 
described  above. 

3.3  Abductive  reasoning 

Abduction  can  be  defined  as  reasoning  to  the  best  explanation  for  a  set  of  observations  (Peirce, 
1958). In AI, this has been formalized as a search for some set of assumptions16 that can prove
the  observations,17  where  an  explanation  for  the  observations  is  a  set  of  assumptions  and 
justification  structure  that  together  infer  the  observations.  This  amounts  to  searching  for  the  best 
set  of  assumptions  that  explain  the  observations.  Abduction  has  been  used  in  AI  for  plan 
recognition,  diagnosis,  language  interpretation,  and  other  tasks. 

16 Assumptions are also referred to as hypotheses in the AI abduction literature.

17 Observations are also referred to as evidence in the AI abduction literature.

Systems that use abduction must at least implement a “better-than” comparator between
explanations so that they can search for the best explanation. Depending on the task,
explanatory preference might rely on which explanation is more probable (e.g., Pearl, 1988),
which  makes  fewer  assumptions  (e.g.,  Ng  &  Mooney,  1992),  or  which  makes  less  costly 
assumptions (e.g., Charniak & Shimony, 1990; Santos, 1994). Cost-based abduction (CBA), in
which each assumption has a weighted cost and the goal is to find a least-cost proof (LCP), is of
particular relevance to this dissertation. Finding LCPs is NP-Hard (Charniak & Shimony, 1994),
and  so  is  approximating  LCPs  within  a  fixed  ratio  of  the  optimal  solution  (Abdelbar,  2004). 

Our  model  of  conceptual  change  uses  abductive  reasoning  to  construct  explanations  for  new 
and  previously-encountered  observations.  We  describe  our  abductive  reasoning  algorithm  in 
Chapter  4,  but  it  is  worth  pointing  out  similarities  with  existing  approaches  here.  A  more 
accurate  term  for  our  explanation  construction  process  is  abductive  model  formulation  since  our 
model  uses  qualitative  model  fragments  to  represent  domain  knowledge  and  composes  them  into 
a  scenario  model  via  model  formulation,  described  above.  The  explanation  evaluation  process  - 
whereby  the  agent  determines  the  best  explanation  -  is  similar  to  CBA,  but  differs  in  two 
important  ways  to  model  humans:  (1)  consistency  is  a  soft  constraint  (i.e.,  contradictions  are 
permitted but costly) within and across explanations; and (2) more than just assumptions have a
cost,  e.g.,  model  fragments,  model  fragment  instances,  contradictions,  and  other  elements.  In 
CBA,  individual  assumptions  have  weighted  costs,  but  in  our  model,  some  sets  of  beliefs  (e.g., 
those  comprising  a  logical  contradiction)  also  have  costs. 
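
The difference between cost-based selection and the soft-consistency variant described above can be sketched as follows. This is a hedged illustration, not the algorithm of Chapter 4: the belief names, the cost values, the contradiction penalty of 10, and the function explanation_cost are all hypothetical, and the candidates are compared exhaustively, which is only feasible for tiny examples.

```python
def explanation_cost(beliefs, belief_cost, contradictions, contradiction_cost):
    """Sum the costs of the beliefs plus a penalty for each contradiction they contain."""
    cost = sum(belief_cost.get(b, 0.0) for b in beliefs)
    cost += sum(contradiction_cost for c in contradictions if c <= beliefs)
    return cost


# Hypothetical background belief and competing explanations of the seasons.
background = frozenset({"hemispheres-have-opposite-seasons"})
explanations = {
    "near-far": frozenset({"earth-closer-in-summer", "closer-means-warmer"}),
    "tilt": frozenset({"axis-is-tilted", "direct-sunlight-means-warmer"}),
}
belief_cost = {
    "earth-closer-in-summer": 1.0,
    "closer-means-warmer": 1.0,
    "axis-is-tilted": 1.5,
    "direct-sunlight-means-warmer": 1.5,
}
# A set of beliefs that is costly to hold together (a soft constraint, not a hard one).
contradictions = [frozenset({"earth-closer-in-summer", "hemispheres-have-opposite-seasons"})]

costs = {
    name: explanation_cost(assumed | background, belief_cost, contradictions, 10.0)
    for name, assumed in explanations.items()
}
best = min(costs, key=costs.get)
```

Under these made-up weights the near-far explanation is cheaper on assumptions alone, but the contradiction penalty makes the tilt explanation the least costly overall. The contradictory explanation is still representable; it is merely more expensive.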

3.4  Analogical  processing 

Two  simulations  described  in  this  thesis  utilize  analogical  reasoning.  This  involves  matching  the 
relations  and  entities  among  two  cases  to  make  similarity  judgments,  generalizations,  and 
inferences.  We  briefly  review  these  analogical  subsystems  next. 

3.4.1 The Structure-Mapping Engine

The  Structure-Mapping  Engine  (SME)  (Falkenhainer  et  al.,  1989)  is  a  domain-general 
computational  model  of  analogy  and  similarity,  based  on  Gentner’s  (1983)  structure-mapping 
theory  of  analogy.  Its  inputs  are  two  cases,  the  base  and  target,  consisting  of  structured 
representational  statements.  SME  computes  one  or  more  mappings  between  the  base  and  the 
target.  Each  mapping  contains  (1)  correspondences  that  match  expressions  and  entities  in  the 
base  with  expressions  and  entities  in  the  target,  (2)  a  numerical  structural  evaluation  score  of  the 
quality  of  the  mapping,  and  (3)  candidate  inferences  that  assert  what  might  hold18  in  the  target. 
Candidate  inferences  may  not  be  deductively  valid,  but  they  may  produce  useful  hypotheses 
(e.g.,  Gentner,  1989;  McLure  et  al.,  2010;  Christie  &  Gentner,  2010).  We  will  refer  to  the 
following SME function below:

•  best-mapping(b,  t):  returns  the  SME  mapping  with  the  highest  structural  evaluation 
score,  using  base  b  and  target  t  cases  as  input. 

18 Since they are the product of structural similarity alone, candidate inferences are not necessarily deductively valid;
however, they are useful hypotheses (e.g., Gentner, 1989; McLure et al., 2010; Christie & Gentner, 2010).

The SME structural evaluation score can be normalized by dividing it by the maximum
self-score (i.e., the maximum score attained by matching either the base or target to itself). This
ensures that 0 ≤ normalized score ≤ 1. We use the following functions to refer to structural
evaluation scores:

• sim-score(m): returns the numerical structural evaluation score of an SME mapping m.

• self-score(c): returns the numerical structural evaluation score of an SME mapping
between a case c and itself. Computed as sim-score(best-mapping(c, c)).

• norm-score(m): returns a normalized structural evaluation score s, such that 0 ≤ s ≤ 1,
for SME mapping m with base m.base and target m.target. Computed as:

$$\mathrm{norm\text{-}score}(m) = \frac{\mathrm{sim\text{-}score}(m)}{\max(\mathrm{self\text{-}score}(m.\mathit{base}),\ \mathrm{self\text{-}score}(m.\mathit{target}))}$$
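
As a small worked example of the normalization, the following Python sketch takes the three scores as plain numbers. The function name norm_score and the sample scores are assumptions for illustration; computing the scores themselves requires SME and is not shown.

```python
def norm_score(sim_score: float, self_score_base: float, self_score_target: float) -> float:
    """Divide a mapping's score by the larger of the two self-scores."""
    denominator = max(self_score_base, self_score_target)
    if denominator <= 0:
        raise ValueError("self-scores must be positive to normalize")
    return sim_score / denominator


# Hypothetical scores: the mapping scores 6.0, the base matches itself at 8.0,
# and the target matches itself at 12.0.
example = norm_score(6.0, 8.0, 12.0)
```

Normalizing by the larger self-score means a small case that maps completely into a much larger one still receives a score well below 1.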

3.4.2  MAC/FAC 

MAC/FAC (Forbus et al., 1995) is a domain-general computational model of similarity-based
retrieval.  Its  inputs  are  (1)  a  probe  case  and  (2)  a  case  library  (set  of  cases).  Cases  consist  of 
structured,  relational  statements,  like  the  inputs  to  SME.  MAC/FAC  retrieves  one  or  more  cases 
from  the  case  library  that  are  similar  to  the  probe  via  a  two-stage  filtering  process.  The  first 
stage  is  coarse,  using  a  vector  representation  automatically  computed  from  the  cases  to  estimate 
similarity  between  the  probe  and  the  contents  of  the  case  library  by  computing  dot  products  in 
parallel.  It  returns  the  case  library  case  with  the  highest  dot  product,  plus  up  to  two  others,  if 
sufficiently  close.  The  second  stage  uses  SME  to  compare  the  probe  with  the  cases  returned  by 
the  first  stage.  It  returns  the  case  with  the  highest  similarity  score,  plus  up  to  two  others,  if 
sufficiently  close.  The  mappings  it  computes  are  available  for  subsequent  processing.  We  use 
the  following  functions  to  describe  MAC/FAC  retrieval: 

• macfac(p, C): given a probe case p and a case library C, returns an ordered sequence M
of mappings retrieved via MAC/FAC, where 0 ≤ |M| ≤ 3. Sequence M is ordered such
that sim-score(m_i) ≥ sim-score(m_{i+1}), so the most similar MAC/FAC retrieval is m_0, and
the most similar case is m_0.target.

• macfac-best(p, C): returns the first element (highest-similarity mapping) of macfac(p, C).
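
The two-stage filter can be sketched in Python under several stated assumptions. The first stage here uses predicate-count vectors as the "vector representation", the "sufficiently close" test is taken to mean within 90% of the best score, and the second stage calls a caller-supplied scoring function because SME itself is not implemented here. The stand-in shared_statements only counts identical statements and is far weaker than structural matching. The sketch returns case names rather than mappings.

```python
from collections import Counter


def content_vector(case):
    """Count how often each predicate occurs in a case (a list of statement tuples)."""
    return Counter(statement[0] for statement in case)


def dot(u, v):
    return sum(count * v[key] for key, count in u.items() if key in v)


def best_and_close(scored, closeness, limit=3):
    """Keep the top-scoring item plus up to limit - 1 others within `closeness` of it."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    if not ranked or ranked[0][0] <= 0:
        return []
    cutoff = closeness * ranked[0][0]
    return [pair for pair in ranked[:limit] if pair[0] >= cutoff]


def macfac(probe, library, structural_score, closeness=0.9):
    """Two-stage retrieval; returns case names ordered by the second-stage score."""
    probe_vector = content_vector(probe)
    mac = best_and_close(
        [(dot(probe_vector, content_vector(case)), name) for name, case in library.items()],
        closeness)
    fac = best_and_close(
        [(structural_score(probe, library[name]), name) for _, name in mac],
        closeness)
    return [name for _, name in fac]


def shared_statements(a, b):
    """Stand-in for SME: counts identical statements, not structural matches."""
    return len(set(a) & set(b))


probe = [("touches", "truck", "person"), ("pushes", "person", "truck"), ("moves", "truck")]
library = {
    "A": [("pushes", "person", "truck"), ("moves", "truck"), ("touches", "truck", "person")],
    "B": [("pushes", "truck", "person"), ("moves", "person"), ("touches", "person", "ground")],
    "C": [("above", "person", "ground")],
}
retrieved = macfac(probe, library, shared_statements)
```

In this toy library, cases A and B tie in the coarse first stage because they use the same predicates, and only the second stage separates them, which mirrors the division of labor described above.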


3.4.3  SAGE 

SAGE  (Friedman  et  al.,  in  preparation)  is  a  computational  model  of  analogical  generalization 
that  uses  both  SME  and  MAC/FAC.  SAGE  clusters  similar  examples  into  probabilistic 
generalizations,  where  each  generalization  typically  describes  a  different  higher-order  relational 
structure. SAGE takes a sequence of positive examples E = (e_0, ..., e_n) represented as cases, and
a numerical similarity threshold s (0 ≤ s ≤ 1) as its inputs. SAGE produces (1) a set of
generalizations G = {g_0, ..., g_i}, each of which is a probabilistic case created by merging similar
examples in E, and (2) a set of ungeneralized examples U = {u_0, ..., u_j} ⊆ E that were not
sufficiently similar to other examples to generalize.

SAGE is initialized with G = U = ∅. When given a new example e_i ∈ E, SAGE calls
macfac-best(e_i, G ∪ U) to find the best mapping m between e_i and an existing generalization or
ungeneralized example. If there is no such mapping or the mapping is below the similarity
threshold (i.e., norm-score(m) < s), then the new example is added to the list of ungeneralized
exemplars (i.e., U = U + e_i) and the algorithm terminates. Otherwise, SAGE merges e_i and the
case that was retrieved via MAC/FAC. The merge happens differently depending on whether
MAC/FAC retrieved an ungeneralized example or a generalization. If the retrieved case is an
ungeneralized example u, then (1) a new generalization g is created by merging e_i with u, (2) the
size of g is set to two (i.e., |g| = 2), (3) g is added to G, and (4) u is removed from U. If
MAC/FAC retrieved an existing generalization g, then e_i is merged into g, and the size of g is
incremented by 1 (i.e., |g| = |g| + 1).
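
The control flow of this procedure can be sketched in Python as follows. The sketch makes strong simplifying assumptions: cases are frozensets of opaque statement strings, retrieval and scoring are replaced by a Jaccard overlap in place of macfac-best and norm-score, and merging is a plain set union with a size counter. The names sage_add and jaccard are hypothetical.

```python
def jaccard(a, b):
    """Stand-in similarity in [0, 1]; the model itself uses normalized SME scores."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sage_add(example, G, U, threshold, similarity=jaccard):
    """Add one example to the generalizations G or the ungeneralized examples U."""
    candidates = [("G", i, g["statements"]) for i, g in enumerate(G)]
    candidates += [("U", i, u) for i, u in enumerate(U)]
    best = max(candidates, key=lambda c: similarity(example, c[2]), default=None)
    if best is None or similarity(example, best[2]) < threshold:
        U.append(example)  # nothing similar enough: keep as ungeneralized
        return
    kind, index, statements = best
    if kind == "U":
        U.pop(index)  # the retrieved example is absorbed into a new generalization
        G.append({"statements": statements | example, "size": 2})
    else:
        G[index]["statements"] = statements | example
        G[index]["size"] += 1


G, U = [], []
for e in [frozenset("abc"), frozenset("abd"), frozenset("xy"), frozenset("abcd")]:
    sage_add(e, G, U, threshold=0.5)
```

After these four examples, the first, second, and fourth have been merged into one generalization of size 3, and the dissimilar third example remains ungeneralized.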

When SAGE merges a new example e with a previous case c (i.e., a previous example or
generalization), it records a probability for each statement to represent its frequency within the
resulting generalization (Halstead & Forbus, 2005). The probability of a statement s within the
resulting generalization g is a function of (1) the probabilities of s in e and c and (2) the size of c,
written as |c|. If c is an ungeneralized example, |c| equals 1; otherwise, |c| is the number of cases
that have been merged into the generalization c. We compute the probability of any statement s in
the resulting generalization g as follows:

$$P(s \text{ in } g) = \frac{P(s \text{ in } c)\,|c| + P(s \text{ in } e)}{|c| + 1}$$


Note that any statement s not present in e or c has probability P(s in e) = 0 or P(s in c) = 0,
respectively. For the case where two examples are merged into a new generalization, all
statements with correspondences in the mapping are inferred with a probability of 1.0, and all
expressions without correspondences in the mapping are inferred with a probability of 0.5.
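
The probability update can be checked with a short Python sketch. It assumes that correspondences have already been resolved, so that a shared dictionary key stands for a statement in c and its counterpart in e; the function name merge_probabilities and the statement labels are hypothetical.

```python
def merge_probabilities(p_c, size_c, p_e):
    """Apply P(s in g) = (P(s in c) * |c| + P(s in e)) / (|c| + 1) to every statement."""
    statements = set(p_c) | set(p_e)
    return {
        s: (p_c.get(s, 0.0) * size_c + p_e.get(s, 0.0)) / (size_c + 1)
        for s in statements
    }


# Two ungeneralized examples: "a" corresponds across both, "b" and "c" do not.
first = {"a": 1.0, "b": 1.0}
second = {"a": 1.0, "c": 1.0}
g = merge_probabilities(first, 1, second)

# A third example containing "a" and "b" is merged into the generalization of size 2.
g_next = merge_probabilities(g, 2, {"a": 1.0, "b": 1.0})
```

The first merge gives "a" a probability of 1.0 and both "b" and "c" a probability of 0.5, as stated above. After the second merge "a" stays at 1.0, "b" rises to 2/3, and "c" falls to 1/3.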

Using  this  merge  technique,  common  relational  structure  is  preserved  with  a  probability  of 
1.0,  and  non-overlapping  structure  is  still  recorded,  but  with  a  lower  probability.  The  probability 
affects  similarity  judgments  in  SAGE.  This  is  because  the  individual  similarity  score  of  each 
SME  match  hypothesis  is  weighted  by  the  probability  of  the  corresponding  base  and  target 
statements.  Consequently,  low-probability  expressions  in  a  generalization  contribute  less  to 
similarity  judgments. 

SAGE  has  been  used  for  concept  learning  in  domains  such  as  sketch  recognition,  spatial 
prepositions, and clustering short stories (all three in Friedman et al., 2011), as well as for
learning sentence structure from example sentences describing events (Taylor et al., 2011).

3.5 Truth Maintenance Systems

A  Truth  Maintenance  System  (TMS)  communicates  with  an  inference  engine  to  track  the 
justifications  for  beliefs  (Forbus  &  de  Kleer,  1993).  Tracking  the  justifications  for  beliefs 
improves  problem-solving  efficiency  in  three  ways  relevant  to  our  conceptual  change  model: 


1. Explanations can be generated via a justification trace.

2. The system can identify the faulty foundations - including assumptions - of a bad
conclusion.

3. Caching inferences by retaining them in justification structure is generally more efficient
than re-running the inference process all over again.19

19 If inference rules are few and inexpensive to run, caching inferences may actually degrade performance.

Figure 7: A TMS containing assumptions (squares), justified beliefs (ovals), justifications
(triangles), and a contradiction (⊥) node (courtesy Forbus & de Kleer, 1993)

Specialized types of TMSs exist, but our model of conceptual change uses a JTMS
(justification-based  TMS),  so  we  only  review  the  details  relevant  to  JTMSs.  For  our  purposes,  a 
TMS  includes  a  network  of  belief  nodes  that  represent  distinct  beliefs  and  justifications  which 
associate  zero  or  more  antecedent  belief  nodes  with  a  consequent  belief  node.  There  are 
different  types  of  belief  nodes,  three  of  which  are  shown  in  the  example  TMS  in  Figure  7: 

1. A premise node represents a belief that holds universally.

2.  An  assumption  node  represents  a  belief  that  can  be  explicitly  enabled  (believed)  or 
retracted  (disbelieved)  by  the  agent. 

3. A normal belief node represents a belief that is believable iff it is justified by other
beliefs. 

4.  A  contradiction  represents  a  logical  inconsistency  within  the  justifying  beliefs.  For 
example,  in  Figure  7,  belief  node  g  supports  a  contradiction,  which  is  supported  by 
assumptions  A  and  C,  so  at  least  one  of  A  and  C  is  faulty.  For  the  sake  of  conserving 
existing  beliefs,  assumption  A  may  be  retracted  to  avoid  retracting  support  for  h. 

In  a  TMS,  multiple  justifications  can  justify  a  single  belief  node.  This  indicates  that  the 
belief  has  more  than  one  unique  line  of  reasoning  for  believing  it.  Suppose  we  want  to  find  an 
explanation  for  a  belief  in  the  TMS  for  abductive  reasoning.  Explanations  for  a  belief  node  n  in 
a TMS are based on well-founded support (Forbus & de Kleer, 1993) for that node.
Well-founded support is any sequence of justifications J_1 ... J_k such that:

• Node n is justified by J_k.

• All antecedents of J_k are justified earlier in the sequence.

• No belief node has more than one justification in the sequence.

In  Figure  7,  h  has  well-founded  support  from  its  supporting  justification,  provided  assumptions  C 
and  E  are  enabled.  The  contradiction  has  well-founded  support  from  its  supporting  justification 
and  the  justification  supporting  g,  provided  A  and  C  are  enabled.  If  A  is  retracted,  both  the 
contradiction  and  g  will  lose  all  well-founded  support.  In  this  thesis,  we  call  each  set  of  possible 
well-founded support a well-founded explanation. Importantly, when a belief n is justified by
two justifications, it has at least two well-founded explanations, and it may have an exponential
number  of  them. 

TMS  justification  structure  is  used  within  our  conceptual  change  model  to  track  the  rationale 
for  beliefs.  The  definition  of  well-founded  explanations  dictates  how  the  justification  structure  is 
aggregated  into  different  explanations  in  our  model.  We  discuss  this  further  in  Chapter  4. 
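
To make the notion of a well-founded explanation concrete, the following Python sketch enumerates them for a small network. It is an illustration under assumptions, not the algorithm of Chapter 4: the function name, the dictionary encoding of justifications, and the network itself are hypothetical. Justifications J1 to J3 are a simplified guess at the part of Figure 7 described in the text, and J4 is invented so that h has two explanations.

```python
from itertools import product


def well_founded_explanations(node, justifications, enabled, visiting=frozenset()):
    """List the well-founded explanations of node as frozensets of justification names.

    justifications maps a name to (consequent, antecedents); enabled holds the
    premises and currently enabled assumptions, which need no justification.
    """
    if node in enabled:
        return [frozenset()]
    if node in visiting:  # circular support is not well-founded
        return []
    results = []
    for name, (consequent, antecedents) in justifications.items():
        if consequent != node:
            continue
        options = [
            well_founded_explanations(a, justifications, enabled, visiting | {node})
            for a in antecedents
        ]
        for combo in product(*options):
            used = frozenset({name}).union(*combo)
            consequents = [justifications[j][0] for j in used]
            if len(consequents) == len(set(consequents)):  # one justification per node
                results.append(used)
    return list(dict.fromkeys(results))  # drop duplicates, keep order


justifications = {
    "J1": ("g", ("A", "C")),
    "J2": ("contradiction", ("g",)),
    "J3": ("h", ("C", "E")),
    "J4": ("h", ("g",)),  # hypothetical second line of support for h
}
with_a = well_founded_explanations("h", justifications, {"A", "C", "E"})
without_a = well_founded_explanations("contradiction", justifications, {"C", "E"})
```

With A, C, and E enabled, h has two explanations, {J3} and {J1, J4}. With A retracted, the contradiction has none, matching the observation that retracting A removes all well-founded support for the contradiction and for g.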

3.6  Microtheory  contextualization 

Conceptual  learning  at  the  scale  we  advocate  in  this  thesis  requires  a  large  knowledge  base  (KB) 
-  both  quantitatively,  in  the  number  of  different  facts,  and  qualitatively,  in  the  number  of 
different  predicates  and  entities.  As  the  knowledge  base  grows,  storing  all  propositional  beliefs, 
rules,  and  mental  models  in  a  single  logical  context  would  quickly  make  reasoning  intractable. 

In  many  learning  systems,  the  control  knowledge  that  initially  speeds  up  learning  and  reasoning 
eventually  degrades  performance.  This  has  been  called  the  utility  problem  (Minton,  1990). 

Aside from tractability issues, conceptual change involves reasoning with competing,
potentially inconsistent knowledge. This requires the use of multiple logical contexts.20
Representing  inconsistent  explanations  requires  representing  inconsistent  beliefs,  and  when  this 
occurs  within  the  same  logical  context,  it  entails  a  contradiction.  A  contradiction  within  a  logical 
context  entails  any  belief  via  indirect  proof  -  for  AI  systems,  but  not  necessarily  for  people  - 
which  is  problematic  for  reasoning  about  the  state  of  the  world. 

Intractability  can  be  mitigated  and  inconsistency  can  be  tolerated  by  contextualizing  the  KB 
into  hierarchical  logical  contexts  that  we  call  microtheories.  Microtheories  are  hierarchical 
because a microtheory m_child can inherit from another microtheory m_parent, so that all statements in
m_parent are visible in m_child. This allows us to quickly define logical contexts for reasoning
without copying propositional beliefs. Contextualizing large KBs is not a new idea - there exist
algorithms for automatically creating KB partitions (e.g., Amir & McIlraith, 2005) and for
performing model formulation in a microtheory-contextualized KB (Forbus, 2010).

In  the  system  described  below,  each  microtheory  in  the  KB  contains  zero  or  more  relational 
statements, and each relational statement in the KB belongs to one or more microtheories.
Microtheories are ubiquitous in the system described here: explanations are represented, in part,
by microtheories; SME cases and SAGE generalizations are microtheories; model formulation
uses microtheories for scenario descriptions, scenario models, and domain theories; and the
entire explanation-based network described below is encoded as relational statements across
several microtheories.
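
A minimal Python sketch shows how inheritance lets two inconsistent explanations share background knowledge without sharing a logical context. The Microtheory class, its visible method, and the example statements are hypothetical; the actual KB is far richer than this.

```python
class Microtheory:
    """A logical context whose visible statements include those of its ancestors."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.statements = set()

    def visible(self):
        """Collect statements from this microtheory and everything it inherits from."""
        seen, stack, result = set(), [self], set()
        while stack:
            mt = stack.pop()
            if mt.name in seen:
                continue
            seen.add(mt.name)
            result |= mt.statements
            stack.extend(mt.parents)
        return result


astronomy = Microtheory("AstronomyMt")
astronomy.statements.add(("orbits", "earth", "sun"))

near_far = Microtheory("NearFarExplanationMt", parents=[astronomy])
near_far.statements.add(("closerInSummer", "earth", "sun"))

tilt = Microtheory("TiltExplanationMt", parents=[astronomy])
tilt.statements.add(("not", ("closerInSummer", "earth", "sun")))
```

Both explanation contexts see the shared orbit fact without copying it, while the two conflicting statements never appear together in any single context.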


20 Temporal and logical qualification predicates (e.g., OpenCyc’s binary holdsIn relation) can be used to
contextualize  propositional  beliefs  within  the  same  logical  context  so  as  to  avoid  entailing  a  contradiction;  however, 
this  is  not  necessarily  the  case  for  contextualizing  rules,  plans,  and  model  fragments. 

3.7 Metareasoning

As  discussed  above,  reasoning  with  conceptual  knowledge  produces  explanations  about  the 
world.  But  the  process  of  conceptual  change  requires  reasoning  about  the  conceptual  knowledge 
and  about  the  explanations  produced,  to  determine  which  beliefs  are  more  productive  and  which 
explanations  better  suit  the  observations.  We  can  therefore  draw  a  distinction  between  (1) 
object-level  reasoning  with  domain  knowledge  and  (2)  meta-level  reasoning  about  object-level 
reasoning.  Figure  8  illustrates  both  control  and  monitoring  from  the  meta-level.  In  AI, 
metareasoning  is  the  deliberation  over  plans  and  strategies  available  to  an  agent,  and  then 
selecting  a  course  of  action  (Horvitz,  1988;  Russell  &  Wefald,  1991;  Cox,  2005).  Since 
metareasoning  can  observe  object-level  operations,  it  can  also  be  used  for  explaining  these 
operations  (e.g.,  Kennedy,  2008)  and  doing  introspective  learning  (e.g.,  Leake  &  Wilson,  2008). 

Figure 8: Meta-level control and monitoring (Cox & Raja, 2007)

In our model of conceptual change, meta-level monitoring tasks include evaluating explanations
(the product of object-level reasoning) and detecting anomalies within observations. Meta-level
control tasks include (1) heuristic-based revision of knowledge and (2) preference encoding over
concepts and explanations, both of which influence future object-level reasoning.


Knowledge can also be encoded at the meta-level. In our computational model, this
includes knowledge about domain knowledge, such as (1) an explicit preference for one meaning
of force over another, (2) knowledge that the anatomical concept LeftVentricle was learned
from a textbook, (3) knowledge that two explanations for the changing of the seasons are in
competition, and so forth. This metaknowledge aids in making decisions for future learning and
reasoning.

3.8  CogSketch 

CogSketch (Forbus et al., 2008) is an open-domain sketching system. CogSketch interprets
the ink drawn by the user, and computes spatial and positional relations (e.g., above, rightOf,
touches) between objects. Further, CogSketch supports multiple subsketches within a single
sketch. We use this feature to create comic graphs (e.g., Figure 9) that serve as stimuli, where
each subsketch in a stimulus represents a different qualitative state, and transitions between them
represent state changes. Similar stimuli have been used in analogical learning experiments with
people (e.g., Chen, 1995; 2002).

Figure 9: A comic graph stimulus created using CogSketch. Three subsketches (Pre-13, Push-13, Move-13) are linked by startsAfterEndingOf arrows.

Figure  9  depicts  a  stimulus  from  the  simulation  in  Chapter  5.  Each  subsketch  represents  a 
change  in  the  physical  system  illustrated.  Within  each  subsketch,  CogSketch  automatically 
encodes  qualitative  spatial  relationships  between  the  entities  depicted,  using  positional  and 
topological  relationships.  For  example,  the  person  in  Figure  9  is  above  and  touching  the 
ground in all three states, but the person and the toy truck are not touching in the third state.
Physical quantities such as area and axis coordinates are also computed by CogSketch and stored
using relations and scalar quantities. For example, the statement

(positionAlongAxis truck-4 Horizontal (Inches 220))

asserts that entity truck-4 is 220 inches to the right of the origin along the Horizontal axis.
The  arrows  within  a  subsketch  (e.g.,  the  blue  and  green  arrows  in  Figure  9)  are  user-generated 
annotations  between  objects,  which  represent  relationships  such  as  applied  force  (blue  arrows) 
and  movement  (green  arrows).  The  arrows  between  subsketches  indicate  temporal  order,  via  the 
startsAfterEndingOf  relation.  Using  quantity  data,  annotations,  and  temporal  relations,  the 
system  can  identify  changes  in  physical  quantities  across  states,  which  we  refer  to  as  physical 
behaviors.  CogSketch  is  used  to  encode  physical  behaviors  comprising  the  training  and  testing 
data  for  two  of  the  simulations  presented  below.  Since  CogSketch  automatically  encodes  the 
knowledge  for  these  simulations,  the  knowledge  representation  choices  were  not  made  with  the 
learning  task  in  mind,  so  the  stimuli  were  not  hand-tailored  for  the  specific  learning  tasks. 
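
As a rough illustration of how physical behaviors could be read off such an encoding, the sketch below orders subsketches using startsAfterEndingOf pairs and reports the direction of change of a position quantity between consecutive states. It is not CogSketch's implementation: the `QuantityFact` record, the function names, the assignment of the 220-inch reading to a particular state, and the other readings are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantityFact:
    """One positionAlongAxis-style reading for an entity in one subsketch (hypothetical record)."""
    state: str
    entity: str
    axis: str
    inches: float


def order_states(starts_after):
    """Order states given (later, earlier) startsAfterEndingOf pairs that form a single chain."""
    later_of = {earlier: later for later, earlier in starts_after}
    laters = {later for later, _ in starts_after}
    first = next(s for s in later_of if s not in laters)  # the state nothing precedes
    ordered = [first]
    while ordered[-1] in later_of:
        ordered.append(later_of[ordered[-1]])
    return ordered


def physical_behaviors(facts, starts_after):
    """Return (entity, axis, from_state, to_state, direction) for consecutive states."""
    states = order_states(starts_after)
    reading = {(f.state, f.entity, f.axis): f.inches for f in facts}
    behaviors = []
    for prev, nxt in zip(states, states[1:]):
        for (state, entity, axis), before in reading.items():
            if state != prev or (nxt, entity, axis) not in reading:
                continue
            after = reading[(nxt, entity, axis)]
            if after > before:
                direction = "increased"
            elif after < before:
                direction = "decreased"
            else:
                direction = "unchanged"
            behaviors.append((entity, axis, prev, nxt, direction))
    return behaviors


# Hypothetical readings for the truck across the three states of Figure 9.
facts = [
    QuantityFact("Pre-13", "truck-4", "Horizontal", 220.0),
    QuantityFact("Push-13", "truck-4", "Horizontal", 220.0),
    QuantityFact("Move-13", "truck-4", "Horizontal", 300.0),
]
temporal = [("Push-13", "Pre-13"), ("Move-13", "Push-13")]
behaviors = physical_behaviors(facts, temporal)
```

Only the sign of each change is kept, which matches the qualitative character of the behaviors described above; the numeric values themselves are not needed downstream in this sketch.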

3.8.1  Psychological  assumptions  about  using  comic  graphs 

Although  it  is  not  background  material  per  se,  it  is  fitting  to  discuss  the  psychological 
assumptions  we  make  by  using  sketched  comic  graphs  as  testing  and  training  data.  We  begin  by 
describing  how  the  simulations  in  this  dissertation  use  sketches  for  testing  and  training. 

Experimenters in cognitive psychology and learning science frequently use multi-state
sketches (like Figure 9) to describe a phenomenon occurring over time and then ask the subject
for predictions or explanations (e.g., Hestenes et al., 1992; Chen, 1995; 2002). Other
experimenters use sketches such as Figure 10 and ask the subject to compare two scenarios
(Ioannides & Vosniadou, 2002; diSessa et al., 2004). We refer to these as sketched testing data.
The simulations in Chapter 5 and Chapter 8 use the same sketched testing data as experimenters,
redrawn by hand in CogSketch to be automatically encoded into relational knowledge for use by
the simulation. Using sketched testing data with CogSketch makes several assumptions about
how people encode sketched knowledge, which we discuss below.

Figure 10: A sketch with two subsketches, redrawn from diSessa et al. (2004).

The  simulations  in  this  dissertation  also  use  sketches  for  learning.  For  example,  the  sketch 
in Figure 9 is used by the simulation in Chapter 5 to learn humanlike preconceptions of pushing,
moving,  and  blocking.  This  use  of  sketched  training  data  is  very  different  from  sketched  testing 
data.  We  list  five  considerations  that  arise  from  our  choice  of  using  comic  graphs  as  learning 
stimuli: 

1.  Real-life  observations  are  represented  as  independent  comic  graph  episodes.  As 
inhabitants  of  a  continuous  world,  people  must  learn  when  a  jumping  event  starts  and 
ends,  rather  than  being  told  the  relevant  start  and  end  state  in  a  comic  graph.  Since  we 
provide  the  system  with  clear-cut  cases  such  as  Figure  9,  we  do  not  expose  the  system  to 
distracting  qualitative  states  that  might  occur  before  or  after  the  event. 

2. Observations in a continuous world are approximated by a sequence of still pictures.
The simulations are not observing a world of continuous - and continuously changing -
physical quantities. Instead, they are given CogSketch’s output: qualitative spatial
relations over objects and numerical values of spatial quantities. The sketched data
therefore conveys relative changes in position, but not relative changes in velocity, so
the simulation does not need to differentiate velocity from acceleration, which is difficult
for novice students (Dykstra et al., 1992).

3.  The  sequence  of  events  is  already  segmented  into  different  qualitative  states.  The 
simulations  do  not  have  to  find  the  often-fuzzy  boundaries  between  physical  behaviors 
as  an  event  unfolds  over  time.  In  the  Figure  9  example,  the  person  pushes  the  truck,  then 
the  truck  and  car  move,  and  then  the  truck  and  car  stop  -  there  is  no  temporal  ambiguity 
in  this  chain  of  events. 

4.  The  objects  and  events  in  the  stimuli  are  relevant  to  the  concept  being  learned.  This  is  a 
factor  of  the  sparseness  of  the  sketches  -  they  contain  few  confusing  events,  e.g.,  a 
dozen  birds  flying  overhead,  a  broken  wheel  on  a  toy  truck,  and  so  forth.  As  a  result, 
there are fewer confounds for inferring causality between events.

5. The encoding in the stimuli is relevant to the concept being learned. The encoding is
sparse in that CogSketch does not encode knowledge about the internal components of
individual glyphs, e.g., that the head of the person in Figure 9 is an oval with a major
axis angle of 39 degrees. Consequently, the qualitative relations produced by CogSketch
comprise the majority of the encoding, and these are especially relevant for learning a
qualitative theory of dynamics (e.g., Chapters 5 and 8). The output of CogSketch is not
nearly as rich as human visual perception; however, we do believe that CogSketch
captures an important subset of the spatial knowledge that people encode. This is not to say
that the sketches contain no extraneous data; they contain entity attributes (e.g., Truck)
and extraneous conditions (e.g., the truck in Figure 9 is touching the car while it moves,
which is not a necessary condition for movement) that must be factored out by learning
algorithms.

All of these consequences of our sketch-based approximation of the real world are reasons
to expect our simulations to learn real-world concepts much faster than people. Despite the
differences between comic graphs and the real world, we believe that using automatically
generated training data is a significant advance over using hand-coded stimuli to simulate
real-world experiences. We discuss more specific implications of these representation choices in the
simulation chapters, where relevant.

Chapter 4: A Computational Model of Explanation-Based Conceptual Change

This  chapter  describes  our  computational  model  of  conceptual  change.  Except  for  the  AI 
technologies  discussed  in  Chapter  3,  the  computational  model  described  in  this  chapter  is  a  novel 
contribution  of  this  dissertation.  We  describe  how  knowledge  is  contextualized  using 
explanations  and  how  constructing  and  evaluating  explanations  affects  the  knowledge  of  the 
agent. This provides the explanatory power of the system, and is especially relevant to the
third  claim  of  this  dissertation: 

Claim  3:  Human  mental  model  transformation  and  category  revision  can  both  be  modeled 
by  iteratively  (1)  constructing  explanations  and  (2)  using  meta-level  reasoning  to  select 
among  competing  explanations  and  revise  domain  knowledge. 

The  core  of  our  model  includes  the  following:  (1)  a  network  for  organizing  knowledge;  (2)  an 
abductive  algorithm  for  constructing  explanations  in  the  network;  (3)  meta-level  strategies  for 
selecting  a  preferred  explanation;  and  (4)  strategies  for  retrospectively  explaining  previously- 
encountered  phenomena.  This  core  model  satisfies  Claim  3. 

After  we  describe  how  knowledge  is  organized,  we  describe  the  specifics  of  how 
explanations  are  constructed,  retrieved,  and  reused.  We  then  describe  how  preferences  are 
computed over explanations, which drives the adoption and propagation of new information.

4.1  Two  micro-examples  of  conceptual  change 

We  consider  the  following  two  micro-examples  of  conceptual  change  in  the  remainder  of  this 
chapter: 




1. Circulatory system example (mental model transformation, from Chapter 7): The
agent’s  mental  model  of  the  circulatory  system  involves  a  loop  from  a  single-chamber 
model  of  the  heart  to  the  body  and  back.  After  incorporating  knowledge  from  a 
textbook,  the  agent  revises  its  mental  model  so  that  (1)  the  heart  is  divided  into  left  and 
right  sides  and  (2)  blood  flows  to  the  body  from  the  left  side  of  the  heart. 

2.  Force  example  (category  revision,  from  Chapter  8):  The  agent  uses  a  force-like  quantity 
q  that  is  present  in  all  objects.  The  agent  cannot  explain  why  a  small  ball  travels  farther 
than  a  large  ball  when  struck  by  the  same  foot  using  its  present  concept  of  q. 
Consequently,  the  agent  revises  q  so  that  it  is  transferrable  between  colliding  objects, 
where  the  amount  transferred  is  qualitatively  inversely  proportional  to  the  size  of  the 
struck  object. 

These  two  examples  are  not  isolated  changes;  they  are  part  of  larger  model  transformations 
(e.g.,  that  the  blood  from  the  body  flows  to  the  right  side  of  the  heart  and  is  then  pumped  to  the 
lungs)  and  trajectories  of  change  (e.g.,  that  forces  exist  between,  and  not  within,  objects)  in  their 
respective  simulations.  But  for  ease  of  explanation  in  this  chapter,  here  we  consider  them  in 
isolation.  Both  types  of  conceptual  change  use  the  same  core  explanation-based  framework 
described  here,  but  category  revision  requires  some  additional  operations.  For  instance,  the 
category  revision  simulation  in  Chapter  8  uses  heuristics  to  revise  a  quantity  in  the  domain 
theory. We discuss operations specific to category revision in Chapter 8.

4.2 Contextualizing knowledge for conceptual change

Conceptual  change  involves  managing  inconsistent  knowledge.  The  agent  must  encode  beliefs 
and  models  that  are  inconsistent  with  prior  knowledge,  use  them  to  reason  about  the  world,  and 
then  determine  which  of  the  available  beliefs  and  models  provide  the  best  (i.e.,  simplest,  most 
accurate,  and  most  credible)  account.  As  we  discussed  in  Chapter  3,  we  can  divide  knowledge 
into  logical  microtheories  to  retain  local  consistency  where  it’s  important.  Our  model  uses 
microtheories  (1)  as  sets  of  beliefs  and  model  fragments  and  (2)  as  cases  for  analogical 
reasoning.  We  begin  by  discussing  how  microtheories  are  used  to  contextualize  different  types 
of  information. 

Recall  the  following  from  our  compositional  modeling  discussion  in  Chapter  3:  (1)  a 
scenario  is  a  set  of  statements  that  describes  a  problem;  (2)  the  domain  theory  is  a  set  of 
scenario-independent  model  fragments  and  statements;  and  (3)  model  formulation  is  the  process 
of  constructing  a  model  of  the  scenario  from  elements  of  the  domain  theory.  It  is  important  for 
the agent to have a record of what information was gathered from an external scenario (e.g., via
observation or reading) and what was inferred via model formulation. This is achieved by
representing each scenario as its own scenario microtheory.21 In the circulatory system
micro-example, multiple scenario microtheories contain the information from the textbook, and in the
force example, two scenario microtheories contain the information about two observations: a
foot kicking a small ball, and the same foot kicking a large ball. Each scenario microtheory is
annotated with metaknowledge (defined in Chapter 3) that records the source of the information
(e.g., observation, textbook, or interaction with another individual).


21 See section 3.6 for a discussion of microtheories.


Some beliefs in a scenario microtheory describe processes, states, and events that the agent
must explain. Following Hempel and Oppenheim (1948), we call these explanandums. Consider
the circulatory system micro-example above: the agent encounters information from a textbook
that (1) the heart is divided into two sides and (2) blood is pumped from the left side to the
body. This textbook-based scenario microtheory contains propositional beliefs describing
objects such as

(isa 1-heart (LeftRegionFn Heart))

which states that the symbol 1-heart is an instance of (LeftRegionFn Heart). It also includes
beliefs that together describe a single situation, such as

(isa leftH2B PhysicalTransfer)
(outOf-Container leftH2B 1-heart)
(into-Container leftH2B body)
(substanceOf leftH2B Blood)

which describe leftH2B, the flow of blood from 1-heart to the body. The four propositional
beliefs describing the flow event leftH2B constitute a single explanandum. When a new
explanandum is encountered in a scenario, it is explained via model formulation.

When the agent encounters a new scenario such as the textbook information above, the
scenario microtheory is added as a parent of the domain knowledge microtheory 𝔻. Recall that
when a microtheory is the parent of another, its statements are inherited by the child
microtheory. 𝔻 thereby inherits all information from observations, interactions, and instruction
that the agent has encountered. In addition to inheriting from scenarios, 𝔻 also contains model
fragments that have been induced from observations (e.g., via SAGE in Chapter 5). Importantly,
information in one scenario microtheory may contradict information in another scenario
microtheory, so the information in 𝔻 may be inconsistent (i.e., its conjunction could entail a
logical contradiction). Propositional beliefs in 𝔻 may serve as premises.*
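
The inheritance scheme just described can be pictured with a small sketch. This is not the model's actual microtheory machinery: the `Microtheory` class, the tuple encoding of statements, the explicit `("not", ...)` negation form, and the `hasTwoSides` predicate are assumptions chosen only to show how a child microtheory can inherit contradictory statements from two parents that are each consistent on their own.

```python
class Microtheory:
    """A named set of statements plus parent microtheories whose statements are inherited."""

    def __init__(self, name, statements=()):
        self.name = name
        self.local = set(statements)
        self.parents = []

    def add_parent(self, parent):
        self.parents.append(parent)

    def visible(self):
        """All statements stated locally or inherited from any ancestor."""
        seen, facts, stack = set(), set(), [self]
        while stack:
            mt = stack.pop()
            if mt.name in seen:
                continue
            seen.add(mt.name)
            facts |= mt.local
            stack.extend(mt.parents)
        return facts


def contradictions(facts):
    """Statements that appear together with their explicit negation (a deliberately naive check)."""
    return {f for f in facts if ("not", f) in facts}


# Hypothetical scenario microtheories: a naive observation and a textbook passage.
naive = Microtheory("naive-scenario", {("not", ("hasTwoSides", "heart"))})
textbook = Microtheory("textbook-scenario", {("hasTwoSides", "heart")})

# The domain knowledge microtheory takes each scenario as a parent.
domain = Microtheory("D")
domain.add_parent(naive)
domain.add_parent(textbook)
```

Each scenario is locally consistent, while the collector microtheory sees both statements, which is the situation the surrounding text describes.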

When the agent constructs an explanation via model formulation, it uses subsets of 𝔻 as the
domain theory and the scenario, since 𝔻 inherits scenario information and contains model
fragments. The output of model formulation includes (1) statements that are logically entailed
by instantiating and activating model fragments, (2) assumptions* that justify other beliefs, but
have no justification themselves, and (3) justifications* that associate antecedent and consequent
statements. Figure 11 shows some justification structure resulting from model formulation in the
circulatory system micro-example. Some belief nodes in Figure 11, e.g., (contains heart
blood), describe the specific structure of the circulatory system. These are in 𝔻 and inherited
from scenario microtheories. The belief (isa ContainedFluid ModelFragment) in Figure
11 refers to the model fragment ContainedFluid, which is also present in 𝔻. Other belief
nodes in Figure 11 (e.g., (isa mfi0 ContainedFluid), (containerOf mfi0 heart),
and (active mfi0)) describe the scenario model. These beliefs are not visible from 𝔻. They
are stored in the provisional belief microtheory 𝔹, which contains beliefs generated via model
formulation.

Figure 11: A small portion of justification structure generated from model formulation in the circulatory
system micro-example. The justification (triangle) at left is the logical instantiation of model fragment
instance mfi0 based on the constraints of ContainedFluid (see Figure 6 for ContainedFluid
definition) and the justification at right is the logical activation of mfi0.

* This term is defined in section 3.5.

The distinction between 𝔻 and 𝔹 is that 𝔻 includes assertions about the world (e.g.,
(contains heart blood): “the heart contains blood”) and models for reasoning about the
world (e.g., ContainedFluid). In compositional modeling, this information would be found in
scenarios and domain theories, respectively. 𝔹 contains the inferences (e.g., (containerOf
mfi0 heart): “the container of the contained fluid mfi0 is the heart”) and assumptions that
result from reasoning with the information in 𝔻 and 𝔹. Propositional beliefs in 𝔻 are believable
(but not necessarily believed) independently of 𝔹, but beliefs in 𝔹 use 𝔻 as a foundation for
inference and assumption. This means that 𝔹 contains the scenario models produced by model
formulation.

The rationale for each inference and assumption in 𝔹 is recorded using the justification
structure produced via model formulation. We defined justifications in our discussion of truth
maintenance systems in Chapter 3, but note that our justifications have multiple consequences.22
The justification structure is recorded as propositional statements in a justification microtheory.
For instance, the rightmost justification in Figure 11 is described by the following statements:

(isa jl Justification)
(antecedentsOf jl (greaterThan (Amount blood heart) zero))
(antecedentsOf jl (isa mfi0 ContainedFluid))
(consequencesOf jl (active mfi0))
(consequencesOf jl (qprop- (Pressure mfi0) (Volume heart)))

22 A justification with multiple consequences can be converted into a set of multiple justifications - one for each
consequence - by creating a single-consequence justification with the same set of antecedents for each consequence.
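
The conversion from a multiple-consequence justification to single-consequence justifications can be sketched in a few lines. The `Justification` record, the string encoding of statements, and the derived names such as `jl.0` are assumptions made for illustration, not the representation used in the implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Justification:
    """Antecedent statements that jointly support one or more consequence statements."""
    name: str
    antecedents: frozenset
    consequences: frozenset


def split_consequences(j):
    """Return one single-consequence justification per consequence, sharing j's antecedents."""
    return [
        Justification(f"{j.name}.{i}", j.antecedents, frozenset([c]))
        for i, c in enumerate(sorted(j.consequences))
    ]


# The rightmost justification of Figure 11, with statements written as plain strings.
jl = Justification(
    name="jl",
    antecedents=frozenset({
        "(greaterThan (Amount blood heart) zero)",
        "(isa mfi0 ContainedFluid)",
    }),
    consequences=frozenset({
        "(active mfi0)",
        "(qprop- (Pressure mfi0) (Volume heart))",
    }),
)
single = split_consequences(jl)
```

Keeping multiple consequences on one justification stores the shared antecedents once; the split form is what a conventional single-consequence truth maintenance system would expect.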


The justifications produced by model formulation are used to reify explanations and
construct explanation microtheories. Each well-founded explanation in the justification structure
is reified as a distinct explanation, and the beliefs in each well-founded explanation are
stored in a separate explanation microtheory.

The final microtheory of note is the adopted domain knowledge microtheory 𝔻a. This is the
subset of 𝔻 that the agent presently accepts as true. This does not mean that the agent explicitly
regards the beliefs in 𝔻 that are not present in 𝔻a (which we write 𝔻/𝔻a) as false; rather, the
agent may be undecided on the truth value of these beliefs. Like 𝔻, 𝔻a is not necessarily
internally consistent. If 𝔻a is inconsistent, nothing is broken - we can simply say that the agent
holds beliefs to be true that are logically inconsistent. 𝔻a will become important later in this
chapter, during our discussion of belief revision using cost functions.

Figure 12: The relationship between microtheories (MTs) in our computational
model. Solid arrows represent “inherit all information from” (i.e., child-of), and
dotted arrows represent “contains some information from.”

The relationships between different microtheories and microtheory types that we have
discussed are shown in Figure 12. The contexts 𝔻 and 𝔹 are collector microtheories of scenarios
and scenario models, respectively. Explanation microtheories contain subsets of information
from 𝔻 and 𝔹 that collectively participate in a well-founded explanation. Finally, 𝔻a contains
the subset of the information from 𝔻 which is presently believed by the agent.

The  remainder  of  our  discussion  of  our  computational  model  relies  on  this  information 
organization  scheme.  We  next  describe  how  explanations,  justifications,  and  beliefs  are  related. 
For  quick  reference,  condensed  definitions  of  the  above  microtheories  and  of  other  terms  used 
later in this chapter are included in a table in the appendix.

Figure 13: A portion of an explanation-based network. (a) Single explanation x0 for an
explanandum naiveH2B (rightmost nodes). (b) After new knowledge is added,
preferences are computed for new knowledge (<c), new model fragment instances (<mfi),
and for the new explanation x1 (<xp).

Legend
f0   (isa heart Heart)
f1   (physicallyContains heart Blood)
f2   (isa Blood StuffType)
f3   (isa body WholeBody)
f4   (physicallyContains body Blood)
mf0  (isa ContainedFluid ModelFragment)
     (greaterThan (Amount Blood heart) 0)
f6   (isa mfi0 ContainedFluid)
f7   (substanceOf mfi0 Blood)
f8   (containerOf mfi0 heart)
mf1  (isa FluidFlow ModelFragment)
f15  (isa mfi2 FluidFlow)
f16  (fromLocation mfi2 mfi0)
f17  (toLocation mfi2 mfi1)
f22  (describes mfi2 naiveH2B)
f23  (isa naiveH2B PhysicalTransfer)
f24  (substanceOf naiveH2B Blood)
f25  (outOf-Container naiveH2B heart)
f26  (into-Container naiveH2B body)
f31  (isa 1-heart (LeftRegionFn Heart))
f32  (physicallyContains 1-heart Blood)


4.3  An  explanation-based  network  for  conceptual  change 


Explanations, justifications, and beliefs can be viewed as a network that supports metareasoning
and conceptual change. This is an extension of a justification structure network (e.g., Figure 11).
A portion of a network is shown in Figure 13, before (Figure 13a) and after (Figure 13b) the
textbook knowledge is incorporated in the circulatory system micro-example outlined above. The
legend of Figure 13 labels the key beliefs and model fragments for reference, but the specific
beliefs are not yet important. We describe the network with respect to this example. To improve
readability, we lay out the network on three tiers. We describe them from bottom to top.

Bottom  (domain  knowledge)  tier 

The bottom tier of the network in Figure 13(a-b) is the domain knowledge tier, and contains
information from 𝔻. This includes propositional beliefs, specifications of quantities, and model
fragments. The bottom tier of Figure 13(a-b) contains the subset of 𝔻 that is relevant to the
circulatory system micro-example. All propositional beliefs on this tier are supported by
observation or instruction.

Middle  (justification)  tier 

The middle tier plots provisional beliefs from 𝔹 (represented as circles in Figure 13) and
justifications (represented as triangles in Figure 13). As in Figure 11, the antecedents of a
justification are on its left, and its consequences are on its right. The provisional beliefs and
justifications in Figure 13(a-b) are the subsets that are relevant to the circulatory system
micro-example. All of the justifications in the system are plotted on this tier. Unlike the bottom tier,
the belief nodes on this tier are not supported by observation or instruction - they are inferred
during the explanation construction process, which we describe in section 4.4.

Top  (explanation)  tier 


The top tier plots explanation nodes. Figure 13(a-b) depicts a subset of all explanations X
constructed by the agent, plotted with quadrilateral nodes x0 and x1 on the top tier. Each
explanation represents a well-founded explanation for some situation or belief. Each explanation
is uniquely defined as x = ⟨J, B, M⟩, where

• M is a set of one or more explanandums that are explained by x.

• J is a set of justifications that comprise a well-founded explanation (defined in Chapter
3) for M. In Figure 13, each explanation node has dashed lines to its justifications J.

• B is the set of all beliefs that comprise the explanation. B includes all antecedents and
consequences of the explanation’s justifications J. This includes domain knowledge
from 𝔻 and provisional beliefs from 𝔹, so B ⊆ 𝔻 ∪ 𝔹. The explanation’s microtheory
contains all beliefs in B.
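
The definition above can be restated as a small data structure in which B is derived from J rather than stored separately. This is only a sketch under assumed names: the `Justification` and `Explanation` records, the string encoding of beliefs, and the example split of beliefs between the domain and provisional sets are illustrative choices, not the model's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Justification:
    antecedents: frozenset
    consequences: frozenset


@dataclass(frozen=True)
class Explanation:
    justifications: frozenset  # J: justifications forming a well-founded explanation
    explanandums: frozenset    # M: what the explanation explains

    @property
    def beliefs(self):
        """B: every antecedent and consequence of the justifications in J."""
        out = set()
        for j in self.justifications:
            out |= j.antecedents | j.consequences
        return frozenset(out)


def grounded_in(explanation, domain, provisional):
    """Check that B is a subset of the union of domain and provisional beliefs."""
    return explanation.beliefs <= (set(domain) | set(provisional))


# Hypothetical fragment of the circulatory system micro-example.
instantiate = Justification(
    antecedents=frozenset({"(physicallyContains heart Blood)"}),
    consequences=frozenset({"(isa mfi0 ContainedFluid)"}),
)
activate = Justification(
    antecedents=frozenset({"(isa mfi0 ContainedFluid)",
                           "(greaterThan (Amount Blood heart) 0)"}),
    consequences=frozenset({"(active mfi0)"}),
)
x = Explanation(frozenset({instantiate, activate}), frozenset({"(active mfi0)"}))

domain_beliefs = {"(physicallyContains heart Blood)"}
provisional_beliefs = {"(isa mfi0 ContainedFluid)",
                       "(greaterThan (Amount Blood heart) 0)",
                       "(active mfi0)"}
```

Deriving B from J keeps the two from drifting apart, at the cost of recomputing the union on each access.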

Based on these definitions, the network in Figure 13(a-b) tells us a lot about the agent’s
learning in the circulatory system micro-example. Before encountering the textbook information
(Figure 13a), the agent justifies the flow of blood to the body naiveH2B with an explanation x0
that involves a FluidFlow process and two ContainedFluid instances: one for the heart and
one for the rest of the body. There are no other explanations for this phenomenon. After the
textbook scenario is incorporated (Figure 13b), the agent has information in 𝔻 about the left
heart (1-heart) and the flow of blood from the left heart to the body (leftH2B). Figure 13b
also contains a new, second explanation x1 which uses new and old information in 𝔻 (the bottom
tier) and 𝔹 (the middle tier). The new explanation x1 justifies the old (naiveH2B) and new
(leftH2B) situations, but note that the previous explanation x0 and its constituent justifications
and beliefs still exist. These explanations are now in competition. In the following, we discuss
how explanations are constructed, how they compete, how they are reused, and how
competitions are resolved to achieve conceptual change.
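
One way to picture the competition between x0 and x1 is to look for explanations whose explanandum sets overlap. The sketch below assumes that two explanations compete when they share at least one explanandum; that criterion is our reading of the example, not a definition given in this section, and the function name and dictionary encoding are hypothetical.

```python
from itertools import combinations


def competing_pairs(explained_by):
    """Return (name_a, name_b, shared_explanandums) for explanations that overlap.

    explained_by maps an explanation name to the set of explanandums it explains.
    """
    pairs = []
    for (a, m_a), (b, m_b) in combinations(sorted(explained_by.items()), 2):
        shared = m_a & m_b
        if shared:
            pairs.append((a, b, shared))
    return pairs


# Explanandum sets for the two explanations in Figure 13b.
explained_by = {
    "x0": {"naiveH2B"},
    "x1": {"naiveH2B", "leftH2B"},
}
pairs = competing_pairs(explained_by)
```

Under this reading, x0 and x1 compete over naiveH2B only, while leftH2B has a single explanation and so raises no competition.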

4.4  Constructing  explanations 

Our computational model constructs explanations for an explanandum m in two steps: (1)
perform abductive model formulation to create one or more scenario models that justify m; (2)
for each well-founded explanation of m within the resulting justification structure, create an
explanation node in the network. Since computing well-founded explanations is described in
Chapter 3, we concentrate here on our abductive model formulation algorithm, which is a
contribution of this research.

As  stated  above,  compositional  model  fragments  simulate  parts  of  mental  models.  Figure  14 
shows  two  model  fragments:  ContainedFluid  and  FluidFlow.  Figure  13  contains  the  belief 
nodes  mfo  (isa  ContainedFluid  ModelFragment)  and  mfi  (isa  FluidFlow 
Model  Fragment)  which  are  used  to  explain  blood  flowing  from  the  heart  (xq)  and  the  left-heart 
(. xi )  to  the  body.  We  use  these  explanations  as  examples  for  our  description  of  explanation 


construction. 
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ModelFragment  ContainedFluid 
Participants : 

?con  Container  (containerOf ) 

?sub  StuffType  (substanceOf ) 

Constraints : 

(physicallyContains  ?con  ?sub) 

Conditions : 

(greaterThan  (Amount  ?sub  ?con)  Zero) 

Consequences : 

(qprop-  (Pressure  ?self)  (Volume  ?con) ) 

ModelFragment  FluidFlow 
Participants  : 

?source-con  Container  (outOf-Container ) 

?sink-con  Container  ( into-Container ) 

?source  ContainedFluid  ( f romLocation) 

?sink  ContainedFluid  (toLocation) 

?path  Path-Generic  (along-Path) 

?sub  StuffType  (substanceOf) 

Constraints : 

(substanceOf  ?source  ?sub) 

(substanceOf  ?sink  ?sub) 

(containerOf  ?source  ?source-con) 

(containerOf  ?sink  ?sink-con) 

(permitsFlow  ?path  ?sub 

?source-con  ?sink-con) 

Conditions : 

(unobstructedPath  ?path) 

(greaterThan  (Pressure  ?source) 

(Pressure  ?sink) ) ) 

Consequences : 

(greaterThan  (Rate  ?self)  Zero) 

(i-  (Volume  ?source)  (Rate  ?self)  ) 

(i+  (Volume  ?sink)  (Rate  ?self)  ) 

Figure  14:  ContainedFluid  (above)  and  FluidFlow  (below)  model  fragments  used  in 
the  simulation  in  Chapter  7.  English  interpretations  of  each  model  fragment  (at  right). 

Before describing our algorithm, it is important to note that “abductive model formulation”
is not synonymous with “abduction.” Abduction computes the set of assumptions (or
hypotheses) that best explains a set of observations. In contrast, abductive model formulation
computes qualitative models of a phenomenon by assuming the existence of entities and relations
between them. If we later compare these qualitative models to compute the best explanation,
then we have performed a nontraditional type of abduction. This section discusses the
construction of the qualitative models; we discuss the comparison of qualitative models later
in this chapter.


English interpretation of ContainedFluid (Figure 14): When a container con physically contains
a type of substance sub, a contained fluid exists. When there is a positive amount of
sub in con, the volume of con negatively influences the pressure of this contained
fluid.

English interpretation of FluidFlow (Figure 14): When two contained fluids - a source and
a sink - are connected by a path, and both are of the same type of substance, a fluid
flow exists. When the path is unobstructed and the pressure of source is
greater than the pressure of sink, the rate of the flow is positive and it decreases the
volume of source and increases the volume of sink.

Our abductive model formulation algorithm starts with the procedure justify-explanandum,
shown in Figure 15, which is given three items as input:

1. A domain context D, which is a microtheory that contains a subset of the model fragments
in 𝔻.

2. A scenario context S, which is a microtheory that contains propositional beliefs (i.e., no
model fragments). S contains a subset of the domain knowledge in 𝔻, since 𝔻 inherits from
scenario microtheories, which are necessary for model formulation. S also contains
provisional beliefs from 𝔹 (from previous model formulation attempts) so that previous
solutions can be reused. For example, if the agent has previously determined that there is a
ContainedFluid instance within the heart, it need not reconstruct this.

3. An explanandum m that requires explanation. Our algorithm takes in two different types
of explanandums: (1) propositional beliefs; and (2) entities that describe processes, e.g.,
naiveH2B, which describes the transfer of blood from heart to body. When an
explanandum is a belief, the algorithm directly justifies the belief, and when the
explanandum is a process entity, the algorithm instantiates models that describe the
entity. For our example, we will use the process entity naiveH2B as the explanandum,
which is described by facts f23-26 in Figure 13.

Arguments S and D can be constructed from one or more explanations. For instance, using a
set of explanations {(J0, B0, M0), ..., (Jn, Bn, Mn)}, we can construct S as a microtheory that
contains all beliefs in the belief sets {B0, ..., Bn} of the explanations, and we can construct D as
the set of all model fragments instantiated in these belief sets. We discuss this further in section
4.5 below.

When  the  explanandum  provided  to  justify-explanandum  is  a  process  instance,  the 
procedure  justify-process  does  the  rest  of  the  work.  Otherwise,  when  the  explanandum  is  a 
proposition  describing  a  quantity  change  or  an  ordinal  relationship,  the  procedures  justify- 
quantity-change  and  justify-ordinal-relation,  respectively,  do  the  rest  of  the  work.  To  be  sure, 
there  are  other  types  of  propositions  that  can  be  justified,  but  since  our  simulations  involve 
explaining  processes  and  state  changes,  these  explanandums  and  procedures  are  sufficient  for  the 
simulations  in  this  thesis. 




Front-ends to abductive model formulation

procedure justify-explanandum(explanandum m, domain D, scenario S)
  if m is a symbol and m is an instance of collection c such that (isa c ModelFragment):
    justify-process(m, D, S)
  else if m unifies with (greaterThan ?x ?y):
    justify-ordinal-relation(m, D, S)
  else if m unifies with (increasing ?x) or with (decreasing ?x):
    let q, d = quantity-of-change(m), direction-of-change(m)
    justify-quantity-change(q, d, D, S)

procedure justify-process(process instance m, domain D, scenario S)
  // Find collections of the given entity within D.
  let C = query D for ?x: (isa m ?x)
  // Find model fragments in D that are specializations of these collections.
  let F = query D for ?x: c ∈ C ∧ (isa ?x ModelFragment) ∧ (genls ?x c)
  for each f in F:
    // Find participant roles {(slot0, role0), ..., (slotn, rolen)} of f.
    let P = participant-roles-of(f)
    // Find entities in S that fill participant roles of an f instance describing m.
    let R = query S for (slot, ?x): (slot, role) ∈ P ∧ (role m ?x)
    abductive-mf-instantiation(f, R, D, S)

procedure justify-ordinal-relation(ordinal relation m, domain D, scenario S)
  // m is of the form (greaterThan (MeasurementOf q s1) (MeasurementOf q s2))
  let q, s1, s2 = quantity-of(m), state-1-of(m), state-2-of(m)
  if query S for (after s2 s1) then:
    justify-quantity-change(q, i-, D, S)
  if query S for (after s1 s2) then:
    justify-quantity-change(q, i+, D, S)

procedure justify-quantity-change(quantity q, direction d, domain D, scenario S)
  // Find direct and indirect influences of q.
  instantiate-fragments-with-consequence((qprop q ?x), D, S)
  instantiate-fragments-with-consequence((qprop- q ?x), D, S)
  instantiate-fragments-with-consequence((d q ?x), D, S)
  let Ii = query S for indirect influences on q  // results are in form (qprop/qprop- q ?x)
  for each i in Ii:
    let di = direction-of-influence(i)  // qprop or qprop-
    let qi = influencing-quantity(i)
    let dc = d
    if di = qprop- then:
      set dc = opposite(d)
    justify-quantity-change(qi, dc, D, S)

Figure 15: Pseudo-code for front-end procedures that trigger abductive model formulation.
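The direction bookkeeping in justify-quantity-change is easy to misread, so here is a small Python sketch of that one step: a change in q is traced back through indirect influences, and the direction is flipped whenever the influence is a qprop-. The function names, the triple encoding (kind, influenced, influencer), and the `seen` guard against revisiting a pair are assumptions introduced for this illustration; the sketch omits the model fragment instantiation that the pseudo-code performs at each step.

```python
def opposite(direction):
    """Flip a direction of change (i+ <-> i-)."""
    return {"i+": "i-", "i-": "i+"}[direction]


def quantity_changes_to_justify(q, d, indirect_influences, seen=None):
    """List the (quantity, direction) pairs visited when justifying
    that quantity q changes in direction d.

    indirect_influences holds (kind, influenced, influencer) triples,
    where kind is "qprop" or "qprop-".
    """
    seen = set() if seen is None else seen
    if (q, d) in seen:          # assumption: do not revisit a pair
        return []
    seen.add((q, d))
    visited = [(q, d)]
    for kind, influenced, influencer in indirect_influences:
        if influenced != q:
            continue
        # A negative proportionality reverses the required direction.
        dc = opposite(d) if kind == "qprop-" else d
        visited += quantity_changes_to_justify(
            influencer, dc, indirect_influences, seen)
    return visited
```

For instance, with the ContainedFluid consequence that volume negatively influences pressure, justifying an increase in pressure leads to justifying a decrease in volume.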

Regardless of the type of explanandum being explained, all paths through
justify-explanandum call the procedure abductive-mf-instantiation. This procedure takes a model
fragment m (e.g., FluidFlow), a role binding list R that associates zero or more participant slots
of the model fragment with known entities (e.g., {(?sub, Blood), (?source-con, heart),
(?sink-con, body)}), and the D and S arguments from justify-explanandum. It instantiates
and activates all possible instances of m that conform to the role binding list R, given the
scenario information S. Importantly, if it cannot bind some participant slot to an entity within S,
it will assume the existence of an entity that satisfies this role, and it will assume propositions
(i.e., constraints and conditions) as necessary. For example, if no known Path-Generic
instance satisfies the constraints for the ?path participant of FluidFlow, the
algorithm will assume the existence of such an entity.

We begin by stepping through an example of explanation construction that uses
justify-process. The behaviors of the justify-quantity-change and justify-ordinal-relation procedures
are discussed in Chapter 6. We use the explanation of situation naiveH2B in Figure 13 as an
example.

[Figure 16: network diagram with panels (a) and (b). Belief nodes legible in the figure:]

f0   (isa heart Heart)
f1   (physicallyContains heart Blood)
f2   (isa Blood StuffType)
f3   (isa body WholeBody)
f4   (physicallyContains body Blood)
mf0  (isa ContainedFluid ModelFragment)
     (greaterThan (Amount Blood heart) 0)
f6   (isa mfi0 ContainedFluid)
f7   (substanceOf mfi0 Blood)
f8   (containerOf mfi0 heart)
mf1  (isa FluidFlow ModelFragment)
f13  (isa (SkolemFn mfi2 ...) Path-Generic)
f14  (permitsFlow (SkolemFn mfi2 ...) ...)
f15  (isa mfi2 FluidFlow)
f16  (fromLocation mfi2 mfi0)
f17  (toLocation mfi2 mfi1)
f22  (describes mfi2 naiveH2B)
f23  (isa naiveH2B PhysicalTransfer)
f24  (substanceOf naiveH2B Blood)
f25  (outOf-Container naiveH2B heart)
f26  (into-Container naiveH2B body)

Figure 16: A portion of the explanation-based network. (a) Before an explanation has been
constructed for naiveH2B. (b) After an explanation x0 has been constructed for
naiveH2B via abductive model formulation.

Suppose that the agent’s knowledge is in the state depicted in Figure 16(a): the agent
believes, due to the PhysicalTransfer instance naiveH2B, that there is a transfer of blood from
the heart to the body. However, there is no knowledge of a path or specific process by which
this occurs. When the agent explains naiveH2B with the call justify-explanandum(naiveH2B,
D, S), the procedure first determines whether naiveH2B can be justified by a model fragment.
Since naiveH2B is a PhysicalTransfer, the system will check whether there are model
fragments that can model a PhysicalTransfer. Suppose the belief (genls FluidFlow
PhysicalTransfer) is present in D, indicating that this is indeed the case.

The next step is to find properties of naiveH2B that are important for modeling it as a
FluidFlow. Consider the following participant roles of FluidFlow from Figure 14:

    ?source-con  Container       (outOf-Container)
    ?sink-con    Container       (into-Container)
    ?source      ContainedFluid  (fromLocation)
    ?sink        ContainedFluid  (toLocation)
    ?path        Path-Generic    (along-Path)
    ?sub         StuffType       (substanceOf)

The procedure must next search for participants for each of the following slots:
{?source-con, ?sink-con, ?source, ?sink, ?path, ?sub}. If it cannot find a participant in the
scenario, it must either instantiate a model to fill the role or assume the existence of the
participant. We discuss each of these cases. First, some of these participants can be found in S.
For example, the participants ?source-con, ?sink-con, and ?sub correspond to the roles
outOf-Container, into-Container, and substanceOf, respectively. The procedure
queries S to determine which entities (if any) fill these roles of naiveH2B:

    (outOf-UnderSpecifiedContainer naiveH2B ?source-con)
    (into-UnderSpecifiedContainer naiveH2B ?sink-con)
    (substanceOf naiveH2B ?sub)
    (fromLocation naiveH2B ?source)
    (toLocation naiveH2B ?sink)
    (along-Path naiveH2B ?path)

Not all of this information is present in S, but some information about naiveH2B is
represented as f24-26 in Figure 16:

    (outOf-UnderSpecifiedContainer naiveH2B heart)
    (into-UnderSpecifiedContainer naiveH2B body)
    (substanceOf naiveH2B Blood)

From these assertions, the procedure constructs the binding list R = {(?source-con,
heart), (?sink-con, body), (?sub, Blood)} to bind the participant variables to ground (i.e.,
non-variable) entities in S. More work must be done: the three remaining participant slots (i.e.,
?source, ?sink, and ?path) must be bound and constraints must be tested in order to explain
naiveH2B with a FluidFlow instance. This is handled by calling
abductive-mf-instantiation(FluidFlow, R, D, S), shown in Figure 17.
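The construction of the binding list R described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the implementation: the function name bind_roles, the triple encoding of scenario facts, and the use of the role names from Figure 14 (outOf-Container, into-Container) are choices made for this example.

```python
def bind_roles(explanandum, participant_roles, scenario):
    """Build a role binding list R for a process explanandum.

    participant_roles: (slot, role) pairs of the model fragment.
    scenario: (predicate, subject, value) facts from S.
    Slots whose role is not asserted for the explanandum stay unbound.
    """
    bindings = {}
    for slot, role in participant_roles:
        for predicate, subject, value in scenario:
            if predicate == role and subject == explanandum:
                bindings[slot] = value
    return bindings


# Participant roles of FluidFlow (Figure 14).
FLUID_FLOW_ROLES = [
    ("?source-con", "outOf-Container"),
    ("?sink-con", "into-Container"),
    ("?source", "fromLocation"),
    ("?sink", "toLocation"),
    ("?path", "along-Path"),
    ("?sub", "substanceOf"),
]

# Facts about naiveH2B (f23-26 in Figure 16).
SCENARIO = [
    ("isa", "naiveH2B", "PhysicalTransfer"),
    ("substanceOf", "naiveH2B", "Blood"),
    ("outOf-Container", "naiveH2B", "heart"),
    ("into-Container", "naiveH2B", "body"),
]

R = bind_roles("naiveH2B", FLUID_FLOW_ROLES, SCENARIO)
```

The three slots missing from R (?source, ?sink, and ?path) are exactly the ones the text says remain to be bound.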

Abductive instantiation of FluidFlow with partial bindings R begins by finding participants
that are themselves model fragments. This includes ?source and ?sink, both of which are
ContainedFluid instances. The procedure finds constraints on these ContainedFluid
instances by substituting the bindings R = {(?source-con, heart), (?sink-con, body),
(?sub, Blood)} into the FluidFlow constraints. This substitution produces the following set
of statements:

    (substanceOf ?source Blood)
    (substanceOf ?sink Blood)
    (containerOf ?source heart)
    (containerOf ?sink body)
    (permitsFlow ?path Blood heart body)

As shown in Figure 14, these statements contain two of the participant roles (substanceOf
and containerOf) for participant slots (?sub and ?con, respectively) of ContainedFluid,
so the system makes the two recursive procedure calls:

    abductive-mf-instantiation(ContainedFluid, R = {(?sub, Blood), (?con, heart)}, D, S)
    abductive-mf-instantiation(ContainedFluid, R = {(?sub, Blood), (?con, body)}, D, S)

Abductive model formulation

procedure instantiate-fragments-with-consequence(proposition p, domain D, scenario S)
  let F = query D for model fragments with some consequence that unifies with p
  for each f in F:
    for each consequence c of f that unifies with p:
      let B = bindings-between(c, p)
      abductive-mf-instantiation(f, B, D, S)

procedure abductive-mf-instantiation(model fragment m, role bindings R, domain D, scenario S)
  // Find participant collections {(slot0, coll0), ..., (slotn, colln)} of m.
  let Cm = participant-collections-of(m)
  // Find the constraints of m.
  let Sm = constraints-of(m)
  // Replace variable slots with known entities in constraints & participants.
  set Sm = replace slot with ent in Sm for every (slot, ent) ∈ R
  set Cm = replace slot with ent in Cm for every (slot, ent) ∈ R
  // If a participant is a model fragment, instantiate it recursively.
  let F = {(slot, coll) ∈ Cm: query D for (isa coll ModelFragment)}
  for each (slot, coll) in F:
    // Using the local constraints Sm, find participant bindings for the recursive call.
    let Sf = ground statements in Sm which:
      1. have a participant role of coll as its predicate and
      2. have slot as a first argument.
    let Rf = bindings between participant slots of coll and corresponding entities in Sf
    // Make a recursive call to instantiate the participant.
    abductive-mf-instantiation(coll, Rf, D, S)
  // Find all instance bindings of m in D, including ones missing participants.
  let Instances = query D for bindings of Sm ∧ Cm
  for each I in Instances:
    // Assume the existence of all unknown participants.
    let UnkParticipants = {(slot, ent, coll) ∈ I: variable(ent)}
    for each (slot, ent, coll) in UnkParticipants:
      let e = new-skolem-entity(e, coll)
      set I = replace (slot, ent, coll) with (slot, e, coll) in I
    // Add the constraints, conditions, consequences, and participant roles to 𝔹,
    // and create justifications for this model fragment’s instantiation and activation.
    instantiate-model-fragment(m, I)

Figure 17: Pseudo-code for abductive model instantiation.
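The substitution step and the derivation of bindings for the recursive calls can be made concrete with a short Python sketch. It is a simplified illustration under assumptions: statements are encoded as tuples, variables are strings that begin with "?", and the helper names substitute and recursive_bindings are hypothetical. It covers only the binding computation, not instantiation or assumption of participants.

```python
def substitute(statement, bindings):
    """Replace bound slot variables in a statement with their entities."""
    return tuple(bindings.get(term, term) for term in statement)


# FluidFlow constraints (Figure 14).
FLUID_FLOW_CONSTRAINTS = [
    ("substanceOf", "?source", "?sub"),
    ("substanceOf", "?sink", "?sub"),
    ("containerOf", "?source", "?source-con"),
    ("containerOf", "?sink", "?sink-con"),
    ("permitsFlow", "?path", "?sub", "?source-con", "?sink-con"),
]

# Participant roles of ContainedFluid, mapped to its slots.
CONTAINED_FLUID_ROLES = {"substanceOf": "?sub", "containerOf": "?con"}


def recursive_bindings(slot, constraints, bindings, roles):
    """Bindings for the fragment that fills `slot`, read off the
    substituted constraints whose predicate is one of its roles."""
    result = {}
    for statement in constraints:
        statement = substitute(statement, bindings)
        is_role = statement[0] in roles and len(statement) == 3
        if is_role and statement[1] == slot and not statement[2].startswith("?"):
            result[roles[statement[0]]] = statement[2]
    return result


R = {"?source-con": "heart", "?sink-con": "body", "?sub": "Blood"}
source_R = recursive_bindings("?source", FLUID_FLOW_CONSTRAINTS, R, CONTAINED_FLUID_ROLES)
sink_R = recursive_bindings("?sink", FLUID_FLOW_CONSTRAINTS, R, CONTAINED_FLUID_ROLES)
```

The two results correspond to the binding lists of the two recursive calls in the walkthrough, one for the heart and one for the body.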

These recursive invocations find no model fragments that can be participants (?sub or
?con) of ContainedFluid. The procedure finds all possible instances of ContainedFluid
using the bindings R that obey the constraints (e.g., (physicallyContains heart Blood))
and participant types (e.g., (isa Blood StuffType)) in S and instantiates them. Both
recursive invocations instantiate a single ContainedFluid instance: one for heart and one for
body. The following assertions are added to S and to the provisional belief microtheory 𝔹:

    (isa mfi0 ContainedFluid)
    (substanceOf mfi0 Blood)
    (containerOf mfi0 heart)

    (isa mfi1 ContainedFluid)
    (substanceOf mfi1 Blood)
    (containerOf mfi1 body)

These beliefs are plotted as f6-8 and f9-11 in Figure 16. Execution returns to the top-level call
to abductive-mf-instantiation, where the procedure queries for remaining FluidFlow
participants. Based on the information in S - including the model fragment instances that have
just been added - the procedure can bind more of the FluidFlow participants: {(?source-con,
heart), (?sink-con, body), (?sub, Blood), (?source, mfi0), (?sink, mfi1), (?path,
?path)}. Note that the list is still incomplete, since no entity from the scenario binds to
the ?path slot. This is because no entity in S is a Path-Generic that satisfies
the FluidFlow constraint (permitsFlow ?path Blood heart body). In this case, the
model fragment is still instantiated. A new symbol (e.g., mfi2) is created for the model
fragment instance, and unbound entities such as ?path are assumed and represented with a
skolem term such as (SkolemParticipant mfi2 along-Path). This skolem term
indicates that this entity was assumed as a participant of mfi2 for the role along-Path. The
following two assertions are added to S and to 𝔹:

    (isa (SkolemParticipant mfi2 along-Path) Path-Generic)
    (permitsFlow (SkolemParticipant mfi2 along-Path) Blood heart body)

These beliefs are plotted as f13-14 in Figure 16(b). Notice that this entity is described only in
the middle (provisional belief 𝔹) layer, since it was generated from model formulation and not
from a scenario (i.e., observation, interaction, or instruction). It can be used like any other entity
and may be a participant of model fragment instances in subsequent calls.
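The assumption of missing participants can be illustrated with a small Python sketch. The function name assume_missing_participants and the dictionary encoding are hypothetical; only the shape of the skolem term, (SkolemParticipant instance role), is taken from the text.

```python
def assume_missing_participants(instance, slot_roles, bindings):
    """Complete a participant binding list for a model fragment instance.

    Slots with no known entity are filled with a skolem term that
    records the instance and the role the assumed entity plays.
    """
    completed = {}
    for slot, role in slot_roles.items():
        entity = bindings.get(slot)
        if entity is None:
            entity = ("SkolemParticipant", instance, role)
        completed[slot] = entity
    return completed


FLUID_FLOW_SLOT_ROLES = {
    "?source-con": "outOf-Container",
    "?sink-con": "into-Container",
    "?source": "fromLocation",
    "?sink": "toLocation",
    "?path": "along-Path",
    "?sub": "substanceOf",
}

partial = {"?source-con": "heart", "?sink-con": "body", "?sub": "Blood",
           "?source": "mfi0", "?sink": "mfi1"}
complete = assume_missing_participants("mfi2", FLUID_FLOW_SLOT_ROLES, partial)
```

Because the skolem term names both the instance and the role, two different instances that each lack a path would receive distinct assumed entities.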

Once the procedure has a complete list of ground participants, it creates a single instance
mfi2 of type FluidFlow and uses this instance to justify the explanandum naiveH2B. In other
cases, there may be multiple instances that justify the explanandum - suppose, for instance, that
the agent knew about two Path-Generic instances that permit flow of Blood from heart to
body. In this case, the agent would not have assumed a ?path participant but would instead
create a FluidFlow instance for each path. The instance mfi2 is described with the following
statements:

    (outOf-UnderSpecifiedContainer mfi2 heart)
    (into-UnderSpecifiedContainer mfi2 body)
    (substanceOf mfi2 Blood)
    (fromLocation mfi2 mfi0)
    (toLocation mfi2 mfi1)
    (along-Path mfi2 (SkolemParticipant mfi2 along-Path))

All model fragment instantiations and model fragment activations are stored as
justifications, and the associated beliefs are stored in 𝔹. This comprises the entire middle
(justification structure) tier in Figure 16(b), which contains a single well-founded explanation
for the explanandum naiveH2B. This well-founded explanation has been reified as an
explanation node x0 plotted in the top tier of Figure 16(b). This is the product of the explanation
construction algorithm.

Now suppose that the agent learns additional details: (1) the heart is divided into left and
right sides (l-heart and r-heart, respectively) and (2) there is a transfer leftH2B of blood
from l-heart to body. The agent can construct an explanation for leftH2B by a process analogous
to that for naiveH2B. A new FluidFlow instance must be created for leftH2B, but the
ContainedFluid instance for body can be reused as its ?sink participant. After explaining
leftH2B, the network will resemble Figure 18. There are three important items of note in
Figure 18, which we will discuss in later sections of this chapter: (1) the new explanation x1
explains leftH2B and also naiveH2B (since l-heart is a more specific region of heart), so x0
and x1 are in competition; (2) x0 and x1 use different but overlapping sets of beliefs and
justifications; and (3) preferences (represented as arrows) have been encoded between concepts
(<c), justifications, and explanations (<xp). Constructing a new explanation does not eliminate
previous explanations; rather, it uses the product of previous explanations to build new
structures.

Figure 18: The network after two explanations have been constructed via abductive
model formulation: x0 explains naiveH2B, and x1 explains naiveH2B and leftH2B.

Our  abductive  model  formulation  algorithm  is  exhaustive  and  complete  relative  to  the 
scenario S, domain theory D, and explanandum m. It is incomplete with respect to S and D
alone,  since  m  guides  the  recursive  search  for  model  fragment  instances.  For  example,  the 
beliefs  (isa  lvr  Liver)  and  (physicallyContains  lvr  Blood)  might  have  been  in  the 
scenario  S,  but  a  corresponding  ContainedFluid  would  not  have  been  instantiated  over 
{(?sub,  Blood),  (?con,  lvr)}  because  the  explanandum  naiveH2B  constrained  the  source  and 
sink  containers  to  heart  and  body,  respectively. 

The  abductive  model  formulation  algorithm  results  in  ( m+ef  model  instantiation  attempts  in 
the  worst  case,  where  m  is  the  number  of  models  in  the  domain  theory,  e  is  the  number  of  entities 
in  the  scenario,  and  p  is  the  number  of  participant  slots  per  model.  The  algorithm  is  guaranteed 
to  converge,  assuming  that  there  is  no  cycle  in  model  fragment  dependency.  Figure  19  illustrates 
the dependency graph for the above abductive model formulation example for naiveH2B.

Figure 19: A graph of the relationships between model fragments and other collections
in the circulatory system example.

Each box is a model fragment (bold-bordered) or an ordinary collection (dashed), and edges represent
participant relationships between types. For instance, the ?source participant slot of
FluidFlow requires a ContainedFluid, and the ?con participant slot of a ContainedFluid
requires a Container. Each edge between two model fragments represents a single recursive
invocation, so - in the example above - there are two recursive invocations for FluidFlow: one
for ?source and one for ?sink. The algorithm is guaranteed to terminate if the domain theory
satisfies the following constraints:

1. There is no path in the graph from a model fragment m to a type t such that t is equal to m
or is a superordinate of m in the genls hierarchy (see Chapter 2 for the definition of
genls within an ontological hierarchy).

2. Each model fragment has a finite number of participant slots (i.e., it is graphable with a
finite number of nodes).

3. The consequences of the model fragments do not introduce new entities that are not
already included as a participant.

To illustrate the necessity of the first constraint, consider what might happen if Container
was (mistakenly) marked as a genls (superordinate) of FluidFlow while explaining naiveH2B:

1. A call to abductive-mf-instantiation attempts to model a FluidFlow.

2. A recursive invocation of abductive-mf-instantiation attempts to instantiate a
ContainedFluid to fill the ?source participant slot.

3. Since the participant ?con of ContainedFluid can be modeled by a FluidFlow (i.e.,
(genls FluidFlow Container)), there would be a recursive invocation of
abductive-mf-instantiation to attempt to instantiate a FluidFlow. Return to (1).

The  second  constraint  is  intuitive:  if  there  are  infinite  participants  of  a  model  fragment,  there 
may  be  infinite  recursive  invocations  to  instantiate  these  participants. 

These  are  reasonable  constraints  for  a  domain  theory.  Aside  from  guaranteeing 
convergence,  the  first  constraint  guarantees  that  the  resulting  scenario  model  will  be  well- 
founded,  according  to  Forbus’  (1992)  formal  definition.  We  include  a  preprocessing  step  that 
ensures  that  the  domain  theory  D  satisfies  this  constraint. 
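One way such a preprocessing check could look is sketched below in Python. It is an assumption-laden illustration, not the preprocessing step used in the thesis: it tests only the first constraint, it treats the dependency graph as a dictionary from each model fragment to the collections its participant slots require, and the names superordinates, reachable_types, and violates_first_constraint are hypothetical.

```python
def superordinates(collection, genls):
    """The collection itself plus everything above it in the genls hierarchy."""
    seen, stack = set(), [collection]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(genls.get(current, ()))
    return seen


def reachable_types(fragment, participants):
    """All types reachable from a fragment along participant edges."""
    seen, stack = set(), list(participants.get(fragment, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(participants.get(current, ()))
    return seen


def violates_first_constraint(participants, genls):
    """Model fragments that can reach themselves or one of their superordinates."""
    return sorted(
        m for m in participants
        if reachable_types(m, participants) & superordinates(m, genls)
    )


# Participant edges of the circulatory system example (Figure 14).
PARTICIPANTS = {
    "FluidFlow": ["Container", "ContainedFluid", "Path-Generic", "StuffType"],
    "ContainedFluid": ["Container", "StuffType"],
}
GOOD_GENLS = {"FluidFlow": ["PhysicalTransfer"]}
# The mistaken hierarchy from the example: Container marked as a genls of FluidFlow.
BAD_GENLS = {"FluidFlow": ["PhysicalTransfer", "Container"]}
```

With the intended hierarchy no fragment is flagged; with the mistaken one, FluidFlow is flagged because it reaches Container, which is then one of its own superordinates.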

The  explanations  produced  by  the  algorithm  contain  more  detail  than  everyday  verbal 
explanations  (i.e.,  they  decompose  phenomena  into  elementary  concepts  and  causes).  In  this 
dissertation,  explanations  are  constructed  to  promote  learning  and  to  answer  questions  for 
experimental  evaluation,  not  for  inter-agent  communication.  The  problem  of  constructing 
explanations  for  another  agent  is  best  addressed  elsewhere,  since  (1)  communicating  an 
explanation  may  have  task-specific  aspects,  and  (2)  explaining  to  another  person  involves 
knowing  what  she  believes  and  often  including  only  beliefs  and  rationale  that  she  lacks. 

One problem we have not yet addressed is the problem of multiple explanations: after
well-founded explanations have been reified as explanation nodes (e.g., x0 and x1 in Figure 18), there
frequently exist multiple explanations for a single explanandum (e.g., naiveH2B in Figure 18).
Explanation competition and the resolution of these competitions are topics of discussion later in
this chapter.

4.4.1 Psychological assumptions of explanation construction

Here we discuss the psychological assumptions underlying our abductive model formulation
algorithm that were not addressed in Chapter 1.

Our abductive model formulation starts with the explanandum and works backwards to
search over a subset of the domain knowledge. A more complete model formulation algorithm
would start from all known entities and work forward to instantiate and activate model
fragments. Since our model uses a directed backward search, it assumes that people do not
consult all of their knowledge when constructing explanations. This is supported by interview
transcripts (e.g., in Sherin et al., 2012) where students must be reminded of information they
have previously encountered before realizing their explanations are inconsistent. In section 4.5,
we discuss how similarity-based retrieval is used to retrieve and reuse previous explanations.
This further reduces the space of domain knowledge that is searched during model formulation.

Our  algorithm  instantiates  all  possible  models  that  conform  to  an  initial  specification  and 
then  segments  the  resulting  structure  into  multiple  explanations.  This  does  not  seem  to  be  the 
case  for  people;  the  same  students  in  Sherin  et  al.  (2012)  appear  to  construct  a  single  explanation 
incrementally  and  only  consider  an  alternative  explanation  once  their  initial  explanation  proves 
inadequate.  In  Chapter  9,  we  discuss  opportunities  for  making  our  algorithm  more  incremental 
and  interleaving  meta-level  analysis. 

Our algorithm terminates once an explanandum has been grounded in non-model-fragment
types, meaning that termination rests solely on (1) what the agent knows about the
scenario/situation, and (2) the model fragments that are available. Consequently, the agent will
continue decomposing causes and mechanisms insofar as the scenario permits. This is an
unlikely psychological assumption, since it predicts that people will take longer to construct
mechanism-based explanations as they accrue more detailed knowledge about mechanisms. We
might remove this assumption by using modeling assumptions (Falkenhainer and Forbus, 1991)
to limit the types of model fragments considered - and thereby the detail of the qualitative model
- based on task- and domain-level properties. Another possible solution is using analogy to infer
explanation structure from one case to another. We discuss these ideas further in Chapter 9.

4.4.2  Explanation  competition 

We have established that conceptual change involves entertaining conflicting ideas. In the two
micro-examples of conceptual change in this chapter, we see two different examples of conflict:
(1) between two models of the circulatory system and (2) between two different quantities that
represent force. We have already described how explanations are constructed. This section
describes how explanations compete, and how they are used to organize information.

As shown above, there can be multiple explanations for the same explanandum. For
example, Figure 18 shows the network with two explanations: (1) an explanation x0 = (J0, B0, M0
= {naiveH2B}) of naiveH2B and (2) an explanation x1 = (J1, B1, M1 = {naiveH2B,
leftH2B}) of both naiveH2B and leftH2B. We say that two explanations compete over some
explanandum(s) M if and only if they both explain those explanandums. For example, x0 and x1
compete to explain naiveH2B since M0 ∩ M1 = {naiveH2B}. By contrast, there is no
competition for leftH2B since x1 is its sole explanation.
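The competition criterion amounts to grouping explanations by the explanandums they share. The Python sketch below illustrates this with the two explanations from Figure 18; the function name competitions and the dictionary encoding (explanation name to its explanandum set M) are assumptions made for the example.

```python
from collections import defaultdict


def competitions(explananda_of):
    """Map each contested explanandum to the explanations that compete for it.

    explananda_of: explanation name -> set of explanandums it explains (its M).
    An explanandum with a single explanation is not contested and is omitted.
    """
    explainers = defaultdict(list)
    for name in sorted(explananda_of):
        for explanandum in explananda_of[name]:
            explainers[explanandum].append(name)
    return {m: names for m, names in explainers.items() if len(names) > 1}


EXPLANATIONS = {
    "x0": {"naiveH2B"},
    "x1": {"naiveH2B", "leftH2B"},
}
contested = competitions(EXPLANATIONS)
```

Here only naiveH2B is contested, since it lies in the intersection of M0 and M1, while leftH2B has x1 as its sole explanation.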

Explanation competition is important because it indicates a conflict between two different
lines of reasoning. In the circulatory system micro-example, naiveH2B is explained using
knowledge of the heart (via x0) and also knowledge of the left heart (via x1). This is not a serious
conflict: one line of reasoning (x1) is just slightly more specific than the other (x0). However,
there can only be one preferred explanation per explanandum. The beliefs in preferred
explanations - including their assumptions, model fragment instances, and inferences - are
adopted by the agent, meaning that they are believed. If x1 becomes the preferred explanation
for both naiveH2B and leftH2B and x0 is not preferred for any explanandum, then the content
of x1 will be adopted by the agent and the content exclusive to x0 will not.

We can formalize the mapping from explanandums to their preferred explanation with an
explanation mapping 𝔼 = {(m0, x0), ..., (mn, xn)}, which maps each explanandum mi to its
preferred explanation xi. The mapping 𝔼 is exhaustive over explanandums, but not exhaustive
over explanations (i.e., a single explanation may be preferred for zero or more explanandums).
We discuss how explanation preferences are computed later in this chapter.

The  explanation  mapping  plays  two  important  roles  in  our  model  of  conceptual  change. 

First,  it  determines,  in  part,  what  the  agent  does  and  does  not  believe.  For  any  given  belief,  if  the 
belief  is  in  some  explanation  within  the  explanation  mapping,  the  agent  is  justified  in  believing 
it. 

The  second  role  of  the  explanation  mapping  is  directing  the  search  for  knowledge  when  a 
new  explanandum  must  be  explained.  It  helps  build  the  S  and  D  contexts  for  the  abductive 
model  formulation  algorithm  discussed  above.  This  means  that  the  content  of  preferred 
explanations  -  and  not  their  non-preferred  competitors  -  is  potentially  reused  in  new 
explanations. 

4.5  Explanation  retrieval  and  reuse 


Suppose  that  the  agent  is  asked  to  explain  some  explanandum  m  (e.g.,  how  blood  gets  from  the 
heart to the body) on a questionnaire and m has already been explained by the agent. How would
the agent go about explaining m? Since m has already been explained, the explanation mapping
$\mathbb{E}$ already associates m with its preferred explanation x = (J, B, M) and the processes and
assumptions underlying m are available in x’s beliefs B. This is the simplest case of retrieving
and  reusing  a  previous  explanation. 

But how would the agent explain m if it had not been previously encountered? Before
constructing a new explanation using the abductive model formulation algorithm, the agent must
first define the scenario S and domain theory D contexts. One simple solution is to define D as
all known model fragments in $\mathbb{D}$ and define S as all beliefs in $\mathbb{D}$ and B. This would guarantee
that the agent has access to all of the relevant information that it has ever encountered; however,
we must also take efficiency into consideration. If we increase the information in S and D (e.g.,
by filling them with all of the agent’s knowledge) we will potentially increase the number of
recursive calls during model formulation, and we will certainly increase search time.
Performance would therefore degrade as the agent accrues knowledge, leading to a utility
problem (Minton, 1990), which we briefly discussed in Chapter 3. Our solution is to
automatically build S and D from the contents of previous explanations, which we described
above in section 4.4. This does not guarantee that the agent has access to all relevant
information, but we do not assume that people have this psychological capability.

Given a new explanandum m or a new scenario microtheory $M_s$, the agent builds model
formulation contexts (S and D) from previous explanations, as described above. There are two
separate procedures for retrieving previous explanations, shown in Figure 20:
(1) find-relevant-explanations-for-scenario is used when the agent encounters a new scenario
microtheory such as a comic graph and (2) find-relevant-explanations-for-explanandum is used
when only an explanandum or query is provided, without an accompanying scenario microtheory.


Similarity-based retrieval of explanations from situations and cases.

procedure find-relevant-explanations-for-explanandum (explanandum m)
    // Use MAC/FAC to find similar explanandums in M, using m as a probe.
    let SimExplanandums = macfac(m, M)
    // Return the explanation mappings for the similar explanandums.
    return {(m', x) ∈ 𝔼 : m' ∈ SimExplanandums}

procedure find-relevant-explanations-for-scenario (microtheory Ms)
    // The case library is all scenario microtheories of previous explanandums.
    let CaseLib = All_scenario_microtheories \ {Ms}
    // Use MAC/FAC to find similar cases, using Ms as a probe.
    let SimMicrotheories = macfac(Ms, CaseLib)
    // Find the explanandums for these similar microtheories.
    let Explanandums = {m' ∈ M | microtheoryOf(m') ∈ SimMicrotheories}
    // Return the explanation mappings for these explanandums.
    return {(m', x) ∈ 𝔼 : m' ∈ Explanandums}

Figure 20: Pseudo-code for best explanation retrieval algorithms, which use
MAC/FAC to find explanations that are relevant for a given explanandum or case.

The procedure find-relevant-explanations-for-scenario uses MAC/FAC to retrieve previous
scenario microtheories that are similar to the new scenario. It then returns all preferred
explanations of the explanandums in these similar scenario microtheories. Similarly, the
procedure find-relevant-explanations-for-explanandum retrieves similar explanandums to the
new explanandum and then returns all preferred explanations of the similar explanandums.

Once the agent has retrieved a set X of explanations, it can construct D as the union of model
fragments used in X, and S as the union of beliefs in $M_s$ and all beliefs B in explanations
$(J, B, M) \in X$. We call this preferred explanation reuse. If no previous explanations exist, or if
no explanations can be constructed by binding S and D in this fashion, then the system sets S to
$\mathbb{D}$, and D to the set of all model fragments in $\mathbb{D}$. The simulations in Chapters 6, 7, and 8 use this
general pattern for building the S and D contexts for model formulation.
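
The retrieval-then-reuse pattern can be sketched as follows. This is an illustration under stated assumptions, not the model's implementation: MAC/FAC is not implemented here, and a simple Jaccard overlap over feature sets stands in for it; the names `Explanation`, `jaccard`, `find_relevant_explanations`, `build_contexts`, the threshold value, and the example facts are all hypothetical. The sketch also covers only the first fallback condition (no explanations retrieved), not the case where model formulation fails with the reduced contexts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Explanation:
    """Hypothetical record of a preferred explanation's reusable content."""
    beliefs: frozenset
    model_fragments: frozenset


def jaccard(a, b):
    """Set-overlap similarity; a crude placeholder, NOT MAC/FAC."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_relevant_explanations(probe, features, mapping, threshold=0.3):
    """Return preferred explanations of explanandums that resemble the probe."""
    return [mapping[m] for m, f in features.items()
            if m in mapping and jaccard(probe, f) >= threshold]


def build_contexts(scenario, retrieved, domain_beliefs, domain_fragments):
    """Return the (S, D) contexts for model formulation."""
    if not retrieved:
        # Fallback: nothing to reuse, so use everything the agent knows.
        return set(domain_beliefs), set(domain_fragments)
    S, D = set(scenario), set()
    for x in retrieved:
        S |= x.beliefs          # beliefs of preferred explanations
        D |= x.model_fragments  # model fragments those explanations used
    return S, D


# Hypothetical prior explanation and a new, similar explanandum.
x1 = Explanation(frozenset({"(isa heart Heart)"}), frozenset({"FluidFlow"}))
features = {"naiveH2B": frozenset({"blood", "heart", "body"})}
mapping = {"naiveH2B": x1}
probe = frozenset({"blood", "heart", "lungs"})

retrieved = find_relevant_explanations(probe, features, mapping)
S, D = build_contexts({"(isa lungs Lung)"}, retrieved,
                      {"all-domain-beliefs"}, {"FluidFlow", "ContainedFluid"})
```

Because only the content of the retrieved preferred explanation enters S and D, the unused model fragment (here, ContainedFluid) is not considered during model formulation, which is the efficiency gain the text describes.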

Using  preferred  explanations  to  seed  new  explanations  has  a  side  effect:  the  contents  of 
preferred  explanations  are  propagated  to  new  contexts,  and  the  contents  of  non-preferred 
explanations  are  not.  This  is  a  positive  feedback  cycle:  if  an  explanation  is  preferred,  its  contents 
are  more  likely  to  be  reused,  which  makes  the  contents  more  likely  to  be  part  of  a  new  preferred 
explanation. 

So  far,  we  have  described  several  characteristics  of  explanations  in  our  cognitive  model:  the 
process  by  which  they  are  constructed;  how  they  organize  information;  how  they  coexist  and 
compete;  and  how  they  are  retrieved  and  reused  to  explain  new  phenomena.  Next  we  discuss 
reasoning  processes  for  evaluating  explanations  and  calculating  preferences. 

4.6  Finding  the  preferred  explanation 

The  simulations  described  in  this  dissertation  use  the  above  network  structure  to  organize 
knowledge  and  aggregate  explanations,  but  they  use  two  different  methods  of  computing 
preferred  explanations: 

1. Epistemic preferences are preferential relations over explanations and domain
knowledge. They are computed using logical rules and stored as statements in
metaknowledge. The preference ordering over a set of competing explanations is used to
determine which is preferred for an explanandum.

2. Cost functions map each explanation to a real number indicating its absolute suitability,
given what is already believed in the adopted domain knowledge microtheory $\mathbb{D}_a$ and in
other preferred explanations. The best explanation is the one with the lowest numerical
cost.

Chapter  6  describes  a  simulation  that  uses  a  cost  function,  and  Chapters  7  and  8  describe 
simulations  that  use  epistemic  preferences.  We  discuss  these  in  the  following  sections,  and  we 
include  ideas  for  integrating  these  two  approaches  in  the  conclusion  of  this  dissertation. 

4.6.1  Rule-based  epistemic  preferences 

Sometimes  a  model  fragment  or  entity  from  one  explanation  can  be  objectively  compared  to  a 
model  fragment  or  entity  in  another  explanation,  and  this  helps  decide  which  explanation  is 
better.  For  instance,  the  entity  left-heart  -  comprised  of  the  left-atrium  and  left-ventricle  -  is 
objectively more specific than the entity heart. If a rule in $\mathbb{D}$ states that “if x is a sub-region of y,
then x is more specific than y,” then the agent can encode a specificity-based epistemic
preference  for  left-heart  over  heart. 

In our model, an epistemic preference (hereafter “preference”) is a binary relation over two
units of knowledge. Each preference $a <^d_t b$ indicates that knowledge b is strictly preferred to
knowledge a along dimension d (e.g., specificity, in the above example) over knowledge type t
(i.e., concepts c, model fragment instances mfi, or explanations xp). The preference between
left-heart and heart entities is shown in Figure 13(b) as a preference between concept-level beliefs
$<_c$. To be more specific, we would write (isa heart Heart) $<^s_c$ (isa 1-heart
(LeftRegionFn Heart)). The dimensions of preference used in our simulations include:
specificity (s); instructional support (i); existence prior to instruction (n); completeness (c); and
revision (r). We discuss the criteria for computing preferences in these dimensions below.

Preferences $b_1 <_c b_2$ between concepts (i.e., beliefs, model fragments, or quantity
specifications) $b_1$ and $b_2$ are computed via logical criteria. Importantly, if $b_1$ and $b_2$ are identical
or comparable for specificity (i.e., $b_1 <^s_c b_2$ or $b_2 <^s_c b_1$), we say they are s-comparable. The
term “commensurable” might apply here as well, but we have already defined it in Kuhn’s
(1962) and Carey’s (2009) terms and avoid it here to reduce confusion. Criteria for
concept-level preferences are as follows:


Preference          Encoded if and only if

$b_1 <^s_c b_2$     Belief or model fragment $b_2$ is more specific than $b_1$, as inferred by some
                    rule(s) in the domain theory $\mathbb{D}$.

$b_1 <^i_c b_2$     $b_1$ and $b_2$ are s-comparable; $b_2$ is supported by instruction and $b_1$ is not.

$b_1 <^n_c b_2$     $b_1$ and $b_2$ are s-comparable; $b_2$ is prior knowledge (i.e., believed prior to
                    instruction) and $b_1$ is not.

$b_1 <^r_c b_2$     $b_1$ and $b_2$ are model fragments or quantity specifications, and $b_2$ is a
                    heuristic-based revision of $b_1$ (see section 8.2 for details).
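
The specificity, instructional support, and prior knowledge criteria can be sketched as a small derivation over sets of concepts. This is a hedged illustration, assuming the convention stated earlier that the preferred item appears on the right of the relation; the function name `concept_preferences` and the triple encoding `(less_preferred, dimension, more_preferred)` are hypothetical, and the revision dimension and the "identical" case of s-comparability are omitted for brevity.

```python
def concept_preferences(more_specific, instructed, prior):
    """Derive concept-level preferences as (worse, dimension, better) triples.

    more_specific: set of (specific, general) pairs inferred by domain rules.
    instructed:    concepts supported by instruction.
    prior:         concepts believed prior to instruction.
    """
    prefs = set()
    for specific, general in more_specific:
        # Specificity: the more specific concept is preferred.
        prefs.add((general, "s", specific))
        # The pair is s-comparable, so the i and n criteria may apply
        # in either direction.
        for worse, better in ((specific, general), (general, specific)):
            if better in instructed and worse not in instructed:
                prefs.add((worse, "i", better))
            if better in prior and worse not in prior:
                prefs.add((worse, "n", better))
    return prefs


# The heart / left-heart example: the left heart is more specific and
# learned from instruction, while the heart is prior knowledge.
prefs = concept_preferences(
    more_specific={("1-heart", "heart")},
    instructed={"1-heart"},
    prior={"heart"},
)
```

For this input the sketch yields two preferences for the left heart (specificity and instructional support) and one for the heart (prior knowledge), which is the kind of conflict that must later be reconciled.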
Provided concept-level preferences $<_c$ over domain knowledge, a preference $i_1 <_{mfi} i_2$
between model fragment instances $i_1$ and $i_2$ is derived from them. These are largely influenced
by concept-level preferences $<_c$.


Preference          Encoded if all of the following criteria are true

$i_1 <^d_{mfi} i_2$, $d \in \{s, i, n, r\}$
    • $i_1$ and $i_2$ are instances of the same model fragment.
    • At least one $i_2$ participant is preferred ($<^d_c$ or $<^d_{mfi}$) to the same-slot $i_1$
      participant and all other participants are s-comparable.
    • No $i_1$ participant is strictly preferred ($<^d_c$ or $<^d_{mfi}$) to the same-slot $i_2$
      participant in the same dimension d as the previous criterion.

$i_1 <^d_{mfi} i_2$, $d \in \{s, i, n, r\}$
    • $i_1$ and $i_2$ are instances of model fragments $m_1$ and $m_2$, respectively.
    • $m_1 <^d_c m_2$ (i.e., the model fragment of $i_2$ is preferred to that of $i_1$).
    • All participants of $i_2$ are either identical or preferred $<^d_c$ to the same-slot
      participants of $i_1$ in the same dimension d as the previous criterion.

$i_1 <^c_{mfi} i_2$
    • $i_2$ is more complete than $i_1$: $i_1$ contains at least one assumed participant,23
      and one or more of the same-slot $i_2$ participants are not assumed.
    • All non-assumed same-slot participants of $i_1$ and $i_2$ are s-comparable.

23 Assumed participants are represented with skolem terms (e.g., (SkolemParticipantFn mfi2 along-Path))
and not with entities from the scenario (e.g., heart or 1-heart). We discussed the conditions for
assuming participants in our description of the abductive model formulation algorithm.
Finally, preferences $<_{xp}$ over explanations are encoded based on preferences over model
fragment instances.

Preference          Encoded if all of the following criteria are true

$x_1 <^d_{xp} x_2$, $d \in \{s, i, n, r, c\}$
    • Explanations $x_1$ and $x_2$ are in competition.
    • At least one model fragment instance $i_2$ of $x_2$ is preferred to a model
      fragment instance $i_1$ of $x_1$ such that $i_1 <^d_{mfi} i_2$ and all other model
      fragments are identical.
    • No model fragment instance $i_1$ of $x_1$ is preferred to a model fragment
      instance $i_2$ of $x_2$ such that $i_2 <^d_{mfi} i_1$ over the same dimension d as the
      previous criterion.


We  have  described  how  preferences  over  conceptual  knowledge  (i.e.,  beliefs,  model 
fragments,  and  quantity  specifications),  model  fragment  instances,  and  explanations  are  derived. 
By these definitions, preferences between concepts $<_c$ trigger preferences between model
fragment instances $<_{mfi}$, which in turn trigger preferences $<_{xp}$ between explanations.

Preferences between explanations decide which explanation is ultimately preferred and
mapped in $\mathbb{E}$, but this only works if there are no cycles in the explanation preference ordering.
Cycles occur when an explanation $x_0$ is directly or transitively preferred over competing
explanation $x_1$ for one dimension, and $x_1$ is preferred over $x_0$ for another dimension. In the
mental model transformation example above, consider the agent that starts with knowledge of the
heart (i.e., (isa heart Heart)) but not the left heart (i.e., (isa 1-heart (LeftRegionFn
Heart))). Upon learning about the left heart from the equivalent of a textbook, it will have the
following specificity, instructional support, and prior knowledge preferences:

(isa heart Heart) $<^s_c$ (isa 1-heart (LeftRegionFn Heart))

(isa heart Heart) $<^i_c$ (isa 1-heart (LeftRegionFn Heart))

(isa 1-heart (LeftRegionFn Heart)) $<^n_c$ (isa heart Heart).

If  these  preferences  propagate  upward  to  preferences  over  model  fragment  instances  and 
competing  explanations,  the  following  preferences  over  explanations  could  occur: 

$xp_0 <^s_{xp} xp_1$
$xp_0 <^i_{xp} xp_1$
$xp_1 <^n_{xp} xp_0$

In  Figure  13(b),  this  cycle  in  preferences  has  been  reconciled  into  a  single  explanation-level 
preference  <xp.  This  is  achieved  with  preference  aggregation,  which  we  describe  next. 

Aggregating  epistemic  preferences 

Epistemic  preferences  along  several  dimensions  can  be  aggregated  into  a  single  dimension 
(Doyle,  1991).  Our  model  achieves  this  with  a  preference  aggregation  function.  The  input  to 
the  function  is  a  preference  ranking  sequence  R  over  all  dimensions  D  =  {s,i,n,c,r}  such  as  R  = 
(s,  i,  n,  c,  r)  or  R  =  (n,  c,  s,  i,  r).  Informally,  the  preference  ranking  describes  the  relative 
importance of each dimension of preference, for cycle resolution. The output is a single, acyclic,
partial ordering $<_{xp}$ over explanations. This is implemented by the following procedure that
computes the aggregate ordering $<_{xp}$:

<xp ← ∅
for each d ∈ R
    for each pref ∈ <xp^d
        if cycles(<xp + pref) = ∅ then
            <xp ← <xp + pref

For  each  dimension  of  preference,  ordered  by  the  preference  ranking  sequence,  all 
preferences  are  added  to  the  aggregate  ordering  unless  they  result  in  a  cycle  in  the  aggregate 
ordering. This produces a partial, acyclic ordering over explanations, assuming that preferences
$<^d_{xp}$ in each dimension d are acyclic. The preference ranking R thereby influences the decision
of  which  competing  explanation  is  ultimately  preferred,  which  will  affect  subsequent  learning 
and  question-answering. 
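
The aggregation procedure can be sketched in a few lines. This is an illustrative version under assumptions: preferences are encoded as `(worse, better)` pairs grouped by dimension, and cycle detection is done with a depth-first reachability check; the function names `has_path` and `aggregate` are hypothetical.

```python
def has_path(edges, start, goal):
    """Depth-first search: is goal reachable from start along the edges?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(b for a, b in edges if a == node)
    return False


def aggregate(ranking, prefs_by_dim):
    """Merge per-dimension preferences into one acyclic ordering.

    ranking:      sequence of dimensions, most important first.
    prefs_by_dim: maps a dimension to a list of (worse, better) pairs.
    """
    order = []
    for d in ranking:
        for worse, better in prefs_by_dim.get(d, []):
            if (worse, better) in order:
                continue
            # Adding worse -> better closes a cycle exactly when
            # better already reaches worse in the aggregate ordering.
            if not has_path(order, better, worse):
                order.append((worse, better))
    return order


# The conflicting explanation-level preferences from the heart example.
prefs = {
    "s": [("xp0", "xp1")],
    "i": [("xp0", "xp1")],
    "n": [("xp1", "xp0")],
}
specific_first = aggregate(("s", "i", "n", "c", "r"), prefs)
prior_first = aggregate(("n", "c", "s", "i", "r"), prefs)
```

With specificity ranked first, the aggregate ordering prefers xp1; with prior knowledge ranked first, it prefers xp0. The ranking therefore determines whether the simulated learner adopts or resists the more specific explanation.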

Psychological  assumptions  regarding  rule-based  epistemic  preferences 

Here  we  discuss  psychological  assumptions  underlying  our  use  of  rule-based  epistemic 
preferences.  Some  of  the  unsupported  assumptions  of  epistemic  preferences  are  resolved  by  our 
use  of  cost  functions,  which  we  describe  in  the  next  section. 

One  assumption  of  our  specificity  preference  is  that  people  prefer  more  specific  explanations 
and  concepts  over  more  general  ones,  all  else  being  equal.  This  has  been  common  practice  in  AI 
for some time (e.g., Poole, 1985). This seems intuitively accurate from an information theoretic
standpoint, since more general information can often be inferred from more specific information
(e.g., since the left-heart pumps blood to the body, the heart pumps blood to the body). Rottman
and Keil (2011) show that people attribute more importance to components of an explanation
with more elaboration. This specificity preference does not assume that people prefer to construct
more specific explanations when communicating to others, since the explanations we discuss
here are self-directed.

Having  a  prior  knowledge  preference  assumes  that  people  may  prefer  to  explain  things  in 
terms  of  entities  they  are  already  acquainted  with  (e.g.,  the  heart)  rather  than  entities  that  they 
recently  encountered  via  instruction  (e.g.,  left  ventricle).  This  is  indeed  the  case  for  students  in 
the  control  group  of  Chi  et  al.  (1994a)  who  (1)  explained  blood  flow  in  terms  of  the  heart  on  a 
pretest,  (2)  read  a  textbook  passage  (twice)  which  included  a  description  of  the  left-heart  and 
left-ventricle pumping blood to the body, and (3) still explained blood flow in terms of the heart
on  the  posttest.  This  is  one  manner  in  which  we  model  resistance  to  change,  which  is  a  notable 
problem  in  achieving  conceptual  change  (for  detailed  discussion  of  resistance,  see  Feltovich  et 
al.,  1994;  Chinn  and  Brewer,  1993). 

Our  instructional  support  preference  assumes  that  people  prefer  information  that  is 
supported  by  instruction  over  comparable  information  that  is  not.  This  is  supported  by  Chi  et  al. 
(1994a),  who  document  students  changing  their  mental  models  when  they  realize  that  their 
beliefs  are  inconsistent  with  a  textbook  passage. 

Our  completeness  preference  assumes  that  people  prefer  explanations  that  make  fewer 
existence  assumptions,  all  else  being  equal.  We  have  defined  an  assumption  as  a  statement  that 
is  not  readily  observed  or  justified,  so  all  else  being  equal,  assumptions  increase  uncertainty  and 
decrease the simplicity of an explanation. Lombrozo (2007) provides evidence that people prefer
simpler explanations, and that they believe simpler explanations to be more probable, all else
being equal.

Epistemic  preferences  describe  one-dimensional  dominance  between  concepts,  model 
fragments,  and  explanations.  They  are  sufficient  for  simulating  the  conceptual  changes 
described  in  Chapters  7  and  8,  but  we  do  not  assume  that  this  is  a  complete  model  of 
psychological  explanation  evaluation.  People  have  other  criteria  by  which  they  judge  causal 
explanations,  including  causal  simplicity,24  coverage  of  observations,  goal  appeal,  and  narrative 
structure (Lombrozo, 2011). We next discuss how a cost function - used in the simulation in
Chapter  6  -  can  capture  some  of  these  macro-level  qualities. 

4.6.2  Cost  functions 

In  many  cases,  preferences  over  individual  concepts  cannot  sufficiently  capture  what  makes  one 
explanation  better  than  another.  There  are  many  other  considerations  when  evaluating  an 
explanation:  How  simple  is  it?  How  does  it  cohere  with  other  explanations  I’ve  constructed? 
Does  it  have  consistent  causal  structure?  Our  cost  function  -  used  in  the  simulation  in  Chapter  6 
to  compute  explanation  preferences  -  is  designed  to  answer  these  questions.  In  this  section  we 
describe  the  cost  function  and  the  elements  of  explanations  that  incur  costs. 

A cost function is a numerical rating of the additional complexity that an explanation would
incur for the agent. It computes this by summing the cost of epistemic artifacts that would be
incurred by accepting an explanation (i.e., mapping an explanandum to it in $\mathbb{E}$). Epistemic
artifacts (hereafter “artifacts”) include assumptions, contradictions, quantity changes, model
fragments, and more (in table form, below). If an artifact within an explanation, e.g., an
assumption, is already used within another preferred explanation in $\mathbb{E}$, that artifact does not add
to the cost of the explanation in question. When multiple explanations compete to explain an
explanandum m, the minimum-cost explanation x is chosen as the preferred explanation so that
(m, x) is added to $\mathbb{E}$. Next we catalog the types of explanation artifacts and describe how
explanation costs are computed.

24 Lombrozo (2011) describes simplicity as perceived probability, but it has also been formulated as the
minimization of assumptions (Ng & Mooney, 1992) or the minimization of assumption cost (Charniak & Shimony,
1990).

Each  artifact  is  identified  by  domain-general  rules  and  patterns,  and  each  has  a  numerical 
cost.  The  cost  of  an  explanation  x  =  (J,  B,  M)  is  computed  as  the  cost  of  all  new  artifacts  that 
would  be  incurred  by  accepting  x’s  beliefs  B.  For  instance,  B  may  contain  new  assumptions, 
new  model  fragment  instances,  and  new  beliefs  that  contradict  beliefs  in  adopted  domain 
knowledge $\mathbb{D}_a$ or in preferred explanations in $\mathbb{E}$. As mentioned above, only new artifacts incur a
cost,  so  there  is  a  strong  bias  for  explaining  new  explanandums  with  pre-existing  assumptions 
and  mechanisms. 

Each artifact a is uniquely defined by the tuple $a = (t_a, B_a)$, where

• $t_a$ is the artifact type (e.g., Assumption), which determines the cost of a. Types and
associated costs are listed below.

• $B_a$ is a set of requisite beliefs, such that the cost of a is incurred if and only if all $B_a$ are
believed (i.e., $B_a$ is a subset of the union of $\mathbb{D}_a$ and the beliefs of all preferred
explanations in $\mathbb{E}$).


We  use  this  notation  to  describe  artifacts  in  Chapter  7. 
Let $A = \{a_0, \ldots, a_n\}$ be the set of all artifacts and let $A_i \subseteq A$ be the set of incurred artifacts (i.e.,
whose costs are incurred by the agent). An artifact is a member of $A_i$ exactly when each of its
requisite beliefs in $B_a$ is in $\mathbb{D}_a$ or in some preferred explanation in $\mathbb{E}$. For ease of discussion, we
can define the union of adopted beliefs of the agent $\mathbb{U}_a$ as all beliefs in the adopted domain theory
and in preferred explanations:

$$\mathbb{U}_a = \mathbb{D}_a \cup \bigcup_{(m,\, x) \in \mathbb{E},\; x = (J, B, M)} B$$

We can now compute the set of incurred artifacts $A_i$ as all artifacts in A whose beliefs $B_a$ are in
$\mathbb{U}_a$:

$$A_i = \{(t_a, B_a) \in A \mid B_a \subseteq \mathbb{U}_a\}$$
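
These two definitions translate directly into set operations. The following is a minimal sketch, assuming beliefs are represented as strings and the explanation mapping stores only the belief set B of each preferred explanation; the names `Artifact`, `adopted_union`, and `incurred_artifacts`, and the example beliefs, are illustrative.

```python
from typing import FrozenSet, NamedTuple


class Artifact(NamedTuple):
    """An artifact a = (t_a, B_a): a type plus its requisite beliefs."""
    kind: str
    requisite: FrozenSet[str]


def adopted_union(domain_adopted, mapping):
    """Adopted domain beliefs plus the beliefs of every preferred explanation.

    mapping: explanandum -> belief set B of its preferred explanation.
    """
    union = set(domain_adopted)
    for beliefs in mapping.values():
        union |= beliefs
    return union


def incurred_artifacts(artifacts, adopted):
    """Artifacts whose requisite beliefs are all currently adopted."""
    return [a for a in artifacts if a.requisite <= adopted]


# Hypothetical beliefs: d1 is adopted domain knowledge, a1 and a2 are assumptions.
artifacts = [
    Artifact("Assumption", frozenset({"a1"})),
    Artifact("Contradiction", frozenset({"a1", "d1"})),
    Artifact("Assumption", frozenset({"a2"})),
]
adopted = adopted_union({"d1"}, {"m0": frozenset({"a1"})})
incurred = incurred_artifacts(artifacts, adopted)
```

The assumption a2 is not incurred because no preferred explanation contains it, whereas the contradiction is incurred because both of its requisite beliefs are adopted.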

We  list  the  artifact  types  ta  used  in  the  simulation  in  Chapter  6,  and  we  describe  how 
requisite  beliefs  Ba  of  each  type  are  computed.  Importantly,  one  type  of  artifact  has  a  negative 
cost,  so  it  provides  a  utility  to  the  agent  rather  than  a  penalty. 


$t_a$: cost                        $B_a$ constituents

Contradiction: 100
    $B_a$ is any set of beliefs such that the conjunction of beliefs $B_a$ - and no strict
    subset thereof - is inconsistent.

Asymmetric quantity change: 40
    $B_a = \{b\}$, where b is a statement in an explanation x’s metaknowledge $B_m$ that
    describes a quantity change in x that does not have a reciprocal quantity change in a
    cyclical state-space.25

Assumed quantity change: 30
    $B_a = \{b\}$, where b is an assumed quantity change. These are costly because quantity
    changes must be explained at some point by introducing a process instance, since
    processes are the sole mechanism of change in a physical system (Forbus, 1984).

Model fragment: 4
    $B_a$ = {(isa mf ModelFragment)}, where mf is a model fragment, e.g., ContainedFluid
    in the circulatory system micro-example.

Assumption: 3
    $B_a = \{b\}$, where b is an assumed proposition.

Model fragment instance: 2
    $B_a$ = {(isa inst mf)}, where inst is the instance name and mf is the model fragment
    type, e.g., (isa mfi0 ContainedFluid) in the circulatory system micro-example.

Credibility: [-1000, 0)
    $B_a = \{b\}$, where b was communicated from another source. The utility (i.e., negative
    cost) of accepting b is proportional to the credibility of the source.

25 Asymmetric quantity changes are possible in any cyclic state space, such as the water cycle, the carbon cycle,
breathing, the seasons, and day/night. The day/night explanation “night turns to day in Chicago because the earth
rotates so that Chicago faces the sun, and day turns to night in Chicago because clouds cover the sun” is asymmetric:
there is no mention of how the earth rotates to block Chicago from the sun for the next sunrise. We provide more
examples of asymmetric quantity changes in Chapter 6.


The artifacts and costs listed above are sufficient for simulating the mental model
transformation in Chapter 6, but we do not believe this list is complete. Also, the costs listed
above were determined empirically to maximize the accuracy of the simulation in Chapter 6, so
we had to make several psychological assumptions which we discuss below.

According to Occam’s razor, a simpler explanation is better, all else being equal. The
penalties for model fragments and their instances promote qualitative parsimony (i.e.,
minimizing the new kinds of entities postulated) and quantitative parsimony (i.e., minimizing the
number of new entities postulated), respectively. Promoting parsimony and penalizing
assumptions makes a simpler explanation less costly, all else being equal.

Explanation and belief cost computation

// Compute the cost that would be incurred by adopting an explanation.
procedure explanation-cost (explanation x = (J, B, M))
    // Find artifacts Ax pertaining to x that are not presently incurred.
    let Ax = {(ta, Ba) ∈ A \ Ai : Ba ∩ B ≠ ∅}
    // Find artifacts A′ incurred if x were adopted. Recall 𝕌a is adopted beliefs.
    let A′ = {(ta, Ba) ∈ Ax : Ba ⊆ (B ∪ 𝕌a)}
    // Return the sum of the costs of these artifacts.
    return Σ_{a ∈ A′} cost(a)

// Compute the cost that can be saved by retracting a belief.
procedure retraction-savings (belief b)
    // If b is not in a preferred explanation...
    if ∄ (m, (J, B, M)) ∈ 𝔼 : b ∈ B then
        // Find artifacts A′ supported by b that are presently incurred.
        let A′ = {(ta, Ba) ∈ Ai : b ∈ Ba}
        // Return the sum of the cost of these artifacts.
        return −Σ_{a ∈ A′} cost(a)
    else return 0

Figure 21: Pseudo-code for computing an explanation’s cost and a belief’s cost using
a cost function. Note that the cost of any explanation that is presently adopted (i.e.,
an explanandum is mapped to it in $\mathbb{E}$) is zero.

The cost function is used for two purposes in our computational model: (1) computing a
preferred explanation from multiple competing explanations and (2) retrospectively changing $\mathbb{D}_a$
and $\mathbb{E}$. In the case of explanation competition, a new explanandum m (e.g., the changing of
Chicago’s seasons, in Chapter 6) is explained by the agent, and multiple explanations X compete
to explain m. The cost function is used to decide which explanation $x \in X$ to associate with m as
its preferred explanation in $\mathbb{E}$. Computing the cost of an explanation x is equivalent to
computing the total cost of all artifacts in $A \setminus A_i$ that would be added to $A_i$ if $(m, x) \in \mathbb{E}$. This
algorithm is shown in Figure 21. The agent uses the function explanation-cost to find the
minimal cost explanation in X.
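
Choosing among competing explanations by cost can be sketched as follows. This is an illustration under assumptions rather than the simulation's code: artifacts are `(type, requisite beliefs)` pairs, the fixed costs are those listed in the table above, Credibility is omitted because its cost varies with the source, and the belief names and the two candidate explanations are hypothetical.

```python
# Fixed artifact costs from the table; Credibility is omitted (variable cost).
COSTS = {
    "Contradiction": 100,
    "AsymmetricQuantityChange": 40,
    "AssumedQuantityChange": 30,
    "ModelFragment": 4,
    "Assumption": 3,
    "ModelFragmentInstance": 2,
}


def explanation_cost(beliefs, artifacts, adopted):
    """Total cost of the new artifacts incurred by adopting these beliefs."""
    total = 0
    for kind, requisite in artifacts:
        already_incurred = requisite <= adopted
        pertains = bool(requisite & beliefs)
        would_be_incurred = requisite <= (beliefs | adopted)
        if not already_incurred and pertains and would_be_incurred:
            total += COSTS[kind]
    return total


# Hypothetical state: assumption a1 and domain belief d1 are already adopted.
adopted = {"a1", "d1"}
artifacts = [
    ("Assumption", frozenset({"a1"})),
    ("Assumption", frozenset({"a2"})),
    ("ModelFragmentInstance", frozenset({"(isa mfi0 ContainedFluid)"})),
    ("Contradiction", frozenset({"a2", "d1"})),
]
candidates = {
    "xa": {"a1", "(isa mfi0 ContainedFluid)"},  # reuses the adopted assumption
    "xb": {"a2", "(isa mfi0 ContainedFluid)"},  # new assumption, contradicts d1
}
costs = {name: explanation_cost(b, artifacts, adopted)
         for name, b in candidates.items()}
preferred = min(costs, key=costs.get)
```

The candidate that reuses an already adopted assumption pays only for its new model fragment instance, while the other pays for a new assumption and the contradiction it creates. This is the bias toward pre-existing assumptions and mechanisms described above.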

The cost function is also used to retrospectively change $\mathbb{D}_a$ and $\mathbb{E}$ to reduce cost. For
instance, it could be the case that the cost of $A_i$ could be significantly reduced by switching the
preferred explanation for some explanandum(s) in $\mathbb{E}$ or removing some belief(s) from $\mathbb{D}_a$.
Consider the sequence of events in Figure 22 that occurs in a simulation trial in Chapter 6: the
agent makes the locally optimal choices for two explanations, but then learns some new
information that renders the two explanations mutually inconsistent, despite retaining their
individual internal consistency. In this situation, the contradictions may be removed by
removing the credible beliefs $b_0$ or $b_1$ from $\mathbb{D}_a$ (thus losing the credibility bonus) or changing the
preferred explanation for Chicago’s seasons (explanandum $m_0$), Australia’s seasons
(explanandum $m_1$), or both. Making these changes may alter the set of beliefs $\mathbb{U}_a$ that the agent
holds to be true, so this is a mechanism of belief revision.
What the agent does / In our model

1. Explained Chicago’s difference in summer and winter temperatures (explanandum m0)
   with an explanation x0 of the earth being closest to the sun in Chicago’s summer and
   being furthest from the sun in Chicago’s winter.
   In our model: m0 ∈ M; (m0, x0) ∈ 𝔼

2. Explained Australia’s difference in summer and winter temperatures (explanandum m1)
   with the similar explanation x1 to x0, using the same mechanisms and assumptions.
   In our model: m1 ∈ M; (m1, x1) ∈ 𝔼

3. Learned from a credible source that Australia’s winter coincides with Chicago’s
   summer (belief b0) and Australia’s summer coincides with Chicago’s winter (belief b1).
   In our model: {b0, b1} ⊆ 𝔻a; (Credibility, {b0}) ∈ Ai; (Credibility, {b1}) ∈ Ai

4. Detected four contradictions due to b0, b1, and the beliefs in x0 and x1, e.g., the
   earth cannot be closest to the sun in Chicago’s summer and farthest in Australia’s
   winter at the same time, since they temporally coincide.
   In our model: four contradiction artifacts added to Ai, of the form
   (Contra, {b0, b1, bx0, bx1}), where bx0 and bx1 are beliefs from x0 and x1, respectively.

Figure 22: A sequence of events from the simulation in Chapter 6 that produces
several contradictions between best explanations and credible domain knowledge.

Restructuring the entire contents of Da and remapping all explanandums in E to find the
minimal cost configuration is very costly. This is due to the number of possible mappings in E
and beliefs in Da that must be considered. If there are m explanandums with x explanations each
and b domain beliefs which can be either adopted (i.e., in Da) or not (i.e., in D\Da), then there are
2^b x^m possible configurations. If there are 16 explanations for both Chicago and Australia, this is
equivalent to 2^2 × 16^2 = 1024 configurations for just the two explanandums and two domain beliefs
in Figure 22. While this is not a serious problem for this small example, the time complexity is
exponential in the number of explanandums being considered. We avoid this combinatorial
explosion by using the greedy, local reconstruction algorithm shown in Figure 23.
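The count above can be checked with a few lines of Python. This is only an arithmetic sketch; the function name exhaustive_configurations is hypothetical and does not appear in the model.

```python
def exhaustive_configurations(b: int, x: int, m: int) -> int:
    """Count global configurations: each of b beliefs is adopted or not,
    and each of m explanandums maps to one of its x explanations."""
    return (2 ** b) * (x ** m)


# Figure 22 example: two credible beliefs, two explanandums,
# and 16 candidate explanations for each explanandum.
figure_22_count = exhaustive_configurations(b=2, x=16, m=2)
print(figure_22_count)  # 1024
```

Adding a third explanandum with 16 explanations multiplies the count by 16, which is the exponential growth the greedy algorithm is meant to avoid.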
Our local reconstruction algorithm takes a single artifact as input, finds the domain beliefs
and explanations that support it, and greedily reconfigures the beliefs and affected explanandums
to reduce the cost. This involves remapping individual explanandums in E and adding or
removing beliefs in Da. Each explanandum under consideration changes its mapping in E to a
lower-cost explanation (if there is one), and each belief in Da under consideration is added or
removed from Da if it will reduce cost. This occurs in a closed loop, until no single action can
reduce the cumulative cost, and then the algorithm terminates. It is guaranteed to terminate,
since each action - and therefore each loop - must reduce the cost of incurred artifacts, and cost
can only be finitely reduced.

Locally restructuring the KB

function restructure-around-artifact (artifact a = (t_a, B_a))
    // Find supporting explanandums.
    let M_a = {M_i ∈ M : (M_i, (J, B, M)) ∈ E ∧ (B_a ∩ B) ≠ ∅}
    // Find supporting beliefs in the domain theory.
    // (D_a is local to this function; Da is the set of adopted domain beliefs.)
    let D_a = B_a ∩ Da
    // Iterate until no further local revisions are made.
    let revised = true
    while revised:
        set revised = false
        for each M_i in M_a:
            // Find explanations that can explain this.
            let X_i = {(J, B, M) ∈ X : M_i ∈ M}
            // Find the least cost explanation.
            let x = argmin over x' ∈ X_i of explanation-cost(x')
            // Make the least cost explanation the best explanation, if not already.
            if (M_i, x) ∉ E then:
                replace (M_i, *) with (M_i, x) in E
                set revised = true
        for each d in D_a:
            // If this belief can be retracted to reduce cost, retract it.
            if retraction-savings(d) > 0 then:
                // Remove d from adopted beliefs.
                set Da = Da − {d}
                set revised = true

Figure 23: Algorithm for restructuring knowledge based on the presence of a high-cost artifact.

The series of unilateral changes in the restructuring algorithm is not guaranteed to find the
minimum cost configuration; however, the average case performance is much more tractable
with respect to the number of beliefs and explanandums considered. Using the same analysis as
above (b = 2, x = 16, m = 2), the number of cost computations on each loop is 2b + xm, which
equals 36. The number of loops varies with the content of the explanations, and a carefully
engineered scenario could still produce a worst-case performance of 2^b x^m cost computations in
total, identical to finding the optimal cost above. In the Figure 22 example from Chapter 6, the
algorithm takes a total of two loops to reach a stable configuration. This required 72 cost
computations instead of 1024 in the worst case for the same circumstance. The algorithm is not
guaranteed to remove the artifact that was provided as input; rather, the input artifact is used as
a marker for possible cost optimization.
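The control flow of the greedy restructuring loop can be sketched in Python. This is an illustration under stated assumptions, not the implementation used in the simulations: the function restructure, the explanation labels "distance" and "tilt", and every number in toy_cost are hypothetical, and the toy cost counts shared assumptions once as a simplification.

```python
from typing import Callable, Dict, List, Set

Mapping = Dict[str, str]  # explanandum -> currently preferred explanation


def restructure(candidates: Dict[str, List[str]], best: Mapping,
                adopted: Set[str], local_beliefs: Set[str],
                cost: Callable[[Mapping, Set[str]], float]) -> int:
    """Greedily remap explanandums and retract beliefs until no single
    change lowers the cost. Mutates best and adopted; returns loop count."""
    loops = 0
    revised = True
    while revised:
        revised = False
        loops += 1
        for m, options in candidates.items():
            # Evaluate each option with every other choice held fixed.
            choice = min(options, key=lambda x: cost({**best, m: x}, adopted))
            if cost({**best, m: choice}, adopted) < cost(best, adopted):
                best[m] = choice
                revised = True
        for d in sorted(local_beliefs & adopted):
            # Retract d only if doing so strictly lowers the total cost.
            if cost(best, adopted - {d}) < cost(best, adopted):
                adopted.discard(d)
                revised = True
    return loops


def toy_cost(best: Mapping, adopted: Set[str]) -> float:
    """Invented cost: assumption penalties, a credibility bonus, and one
    large contradiction penalty (all values are illustrative only)."""
    assumption_cost = {"distance": 1.0, "tilt": 2.0}
    total = sum(assumption_cost[kind] for kind in set(best.values()))
    total -= 5.0 * len(adopted)  # bonus for each adopted credible belief
    if adopted >= {"b0", "b1"} and set(best.values()) == {"distance"}:
        total += 100.0  # both distance explanations contradict b0 and b1
    return total


candidates = {"m0": ["distance", "tilt"], "m1": ["distance", "tilt"]}
best = {"m0": "distance", "m1": "distance"}
adopted = {"b0", "b1"}
loops = restructure(candidates, best, adopted, {"b0", "b1"}, toy_cost)
print(best, sorted(adopted), loops)
```

Because every accepted change strictly lowers the cost and there are finitely many configurations, the loop must stop; the final pass makes no changes and only confirms that the configuration is stable.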

Psychological  assumptions  regarding  cost  functions 

Cost functions capture psychological explanation preferences that cannot be expressed using
rule-based epistemic preferences alone; however, they make some additional assumptions about how
people evaluate explanations.

Several  psychological  assumptions  underlie  the  types  of  artifacts  that  incur  a  cost  in  our 
model.  In  our  model,  process  instances  and  quantity  changes  are  the  mechanisms  and  effects  of 
a  dynamic  system,  respectively.  These  comprise  the  root  and  intermediary  causes  within  a 
system.  People  prefer  explanations  with  fewer  causes,  all  else  being  equal  (Lombrozo,  2007),  so 
it  is  sensible  to  penalize  process  instances  and  quantity  changes. 
By penalizing contradictions, we assume that people desire consistency within and across
explanations. This assumption is common to the other theories of conceptual change in Chapter
2, and it is clearly supported in interviews (e.g., Sherin et al., 2012) where students revise their
explanations when they detect inconsistencies.

We assume that an explanation’s quality is not solely determined by its probability. When
we refer to an explanation’s probability, we mean the joint probability of the explanation’s
assumptions relative to other adopted beliefs. To illustrate, here is how we might compute the
most preferable explanation using probability alone: we use probabilities to represent the agent’s
purported likelihood of a given belief, and then search for a maximum a posteriori (MAP) truth
value assignment to all existing assumptions. The explanation that conforms to this set of
assumptions would be the preferred explanation. We could then model people’s simplicity
preference (i.e., minimizing the number of causes, similar to above) by assigning more complex
causes a lower prior probability. Finally, we can avoid contradictions by encoding a zero for the
joint probability of mutually inconsistent beliefs (e.g., {b0, b1, bx0, bx1} in Figure 22). Thus,
when new knowledge makes an explanation less probable, the agent could revise its explanation
by searching for more probable truth value assignments for assumptions.

The alternative, purely probabilistic model we have just described makes a very strong
assumption that we do not make in our computational model: assignments of truth values to
assumptions that are equally probable are equally preferable to people. To illustrate why this is
problematic, consider a student with two contradictions (⊥a and ⊥b) in his adopted beliefs. Since
⊥a or ⊥b alone will result in a probability of zero, resolving ⊥a while ⊥b still exists does not
measurably improve the student’s interpretation of the world under that model, so no action need
be taken. This is not to suggest that a purely probabilistic approach to evaluating explanations is
infeasible, but it would require additional considerations for evaluating explanations both locally
and globally.
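The difference between the two scoring schemes can be made concrete with a small Python sketch. Both functions and all numeric values (the prior of 0.8 and the penalty of 100) are hypothetical and chosen only for illustration.

```python
def joint_probability(prior: float, contradictions: int) -> float:
    """Purely probabilistic score: any contradiction forces the joint
    probability of the adopted beliefs to zero."""
    return 0.0 if contradictions > 0 else prior


def artifact_cost(contradictions: int, penalty: float = 100.0) -> float:
    """Additive cost: each contradiction artifact is penalized separately."""
    return penalty * contradictions


before, after = 2, 1  # the student resolves one of two contradictions
print(joint_probability(0.8, before), joint_probability(0.8, after))
print(artifact_cost(before), artifact_cost(after))
```

Under the probabilistic score both states are indistinguishable, whereas the additive cost drops when one contradiction is resolved, so a cost-minimizing agent has a reason to make the partial repair.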

Our  cost  function  assigns  all  assumptions  an  identical  cost,  but  it  is  not  likely  that  people 
view  all  assumptions  as  equally  desirable.  This  could  be  improved  by  representing  the 
uncertainty  of  beliefs  -  potentially  using  probabilities  -  and  then  computing  the  cost  of  an 
assumption  as  a  function  of  uncertainty.  We  discuss  this  further  in  Chapter  9. 

4.7  Retrospective  explanation 

In our discussion of cost functions, we described a restructuring algorithm that manipulates
previously explained beliefs and transitions support to lower-cost explanations. This requires
that previous explanations are already present for evaluation and potential transition.
Importantly, beliefs and model fragments may have been added to D since an explanandum was
encountered, so the agent might be able to construct a better explanation than presently exists.
Retrospective explanation is the process of constructing new explanations for previous
explanandums.

The first task in retrospective explanation is to detect opportunities for retrospective
explanation. Adding knowledge to the domain theory D can change the space of possible
explanations for an explanandum, but not all expansions of D affect all explanandums M. In the
simulations described in Chapters 7 and 8, concept preferences <c dictate opportunities for
retrospective explanation.
Figure 24: Model fragment ArterialFlow is preferred over FluidFlow due to greater
specificity, but leftH2B has not yet been explained using the preferred knowledge.

As illustrated in Figure 24, explanandum leftH2B is explained with a model fragment
FluidFlow, but not with the preferred model fragment ArterialFlow. This might occur if
ArterialFlow is a more specific (and therefore preferred) model fragment, but it was learned after
leftH2B was explained. A similar pattern could occur when revising models of force and motion: the
observation that a ball is rolling to the left has been explained with a model m0 of force-driven
movement, but m0 has since been revised as m1 such that m0 <rc m1. In both cases, a preferred
model fragment was not present when an explanandum was explained, so a retrospective
explanation opportunity exists. More generally, a retrospective
explanation opportunity exists any time a concept c has been used to explain an explanandum m
and a preferred concept c' (i.e., c <c c') has not been attempted for use with that explanandum.
Every retrospective explanation opportunity will thus be a triple of a belief plus a pair of
concepts.
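The detection rule can be sketched as a simple search over explanandums and preference pairs. The function find_opportunities and its three data structures are hypothetical simplifications introduced for illustration; the actual knowledge base representation is richer.

```python
from typing import Dict, List, Set, Tuple


def find_opportunities(
    used: Dict[str, Set[str]],          # explanandum -> concepts used to explain it
    preferences: Set[Tuple[str, str]],  # (c, c_prime) meaning c <c c_prime
    attempted: Set[Tuple[str, str]],    # (explanandum, concept) pairs already tried
) -> List[Tuple[str, str, str]]:
    """Return (explanandum, used concept, preferred concept) triples."""
    opportunities = []
    for m in sorted(used):
        for c, c_prime in sorted(preferences):
            already_tried = c_prime in used[m] or (m, c_prime) in attempted
            if c in used[m] and not already_tried:
                opportunities.append((m, c, c_prime))
    return opportunities


used = {"leftH2B": {"FluidFlow"}}
preferences = {("FluidFlow", "ArterialFlow")}
attempted: Set[Tuple[str, str]] = set()
first = find_opportunities(used, preferences, attempted)
print(first)

# Recording an attempt (successful or not) removes the opportunity.
attempted.add(("leftH2B", "ArterialFlow"))
second = find_opportunities(used, preferences, attempted)
print(second)
```

Keeping a record of attempted pairs is what prevents the same opportunity from being detected again on every pass.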
The simulations in Chapters 7-8 search for retrospective explanation opportunities any time
concept-level preferences are computed after incorporating a scenario.26 Once a retrospective
explanation opportunity is found, the explanandum is explained using the abductive model
formulation algorithm described above. This provides additional support for previously
explained beliefs without disrupting existing explanations in the network. The evaluation
techniques described above (i.e., preference computation and cost function) can then be used to
determine whether the new explanation is preferable to existing ones. This is how retrospective
explanation results in belief revision.

With respect to Figure 24, retrospective explanation may fail to construct a new explanation
for leftH2B using ArterialFlow. In this case, the triple (leftH2B, FluidFlow,
ArterialFlow) is stored as a retrospective explanation failure so that the system will not
attempt retrospective explanation again for the same reason. The existing explanation xj will remain
the best explanation for leftH2B.

The agent may add new information (i.e., beliefs, models, and quantities) via inductive
learning (e.g., Chapter 5), instruction (e.g., Chapter 7), or heuristic-based revision (e.g., Chapter
8), but these additions do not by themselves constitute successful conceptual change. The agent
ultimately achieves conceptual change using the methods described in this chapter. After
acquiring or revising information, the agent propagates it to new contexts and scenarios by using
it to explain new and previous phenomena. If the new explanations are preferable to prior ones,
the agent re-justifies its beliefs with new explanations. The agent can thereby adopt new
combinations of information and new representations in the presence of conflicting knowledge,
which is, by definition, conceptual change.

26 In situations where the agent does not have time to reflect on previous scenarios, retrospective explanation can be
delayed until a later time. We discuss the implications of delaying retrospective explanation - and ways to
experimentally measure the effects - in section 9.4.
Chapter 5: Learning intuitive mental models of motion from observation

Conceptual  change  does  not  begin  with  a  blank  slate.  This  chapter  presents  a  simulation  of  how 
intuitive  models  can  be  learned  from  a  sequence  of  observations.  This  provides  an  account  of 
how  flawed  mental  models  are  formed  as  a  precursor  to  conceptual  change,  but  it  does  not  in 
itself  constitute  conceptual  change.  Other  systems  have  learned  humanlike  misconceptions  from 
examples  (e.g.,  Esposito  et  al.,  2000),  but  with  different  methods  and  knowledge  representations, 
as  we  discuss  in  Chapter  9. 

Students’ pre-instructional knowledge has been explored in the cognitive science literature
in many domains. This knowledge is also referred to as preconceptions, intuitive theories, and -
when inconsistent with scientific theories - misconceptions or alternate conceptions.
Pre-instructional knowledge in scientific domains (e.g., dynamics and biology) is presumably learned
via observation and interaction with the world. The simulation described in this chapter provides
a computational account of how descriptive mental models of dynamics might be learned via
observations.27

We  use  the  term  descriptive  mental  models  here  because  the  models  learned  by  this 
simulation  describe  what-follows-what  without  specifying  conceptual  mechanisms  and  physical 
processes  that  cause  change.  Consider  the  following  system  of  beliefs: 

When an object a is moving in the direction d of another object b:
    a might touch object b and push it in direction d, in which case:
        b may block a, or
        b may move in direction d.


27  This  chapter  expands  the  original  account  described  in  Friedman,  Taylor,  and  Forbus  (2009). 
This simple descriptive account of dynamics is incomplete and it does not appeal to any
conceptual  quantities  such  as  force,  inertia,  or  impetus,28  but  it  still  has  considerable  predictive 
and  explanatory  power.  It  contains  temporal  constraints  (i.e.,  one  state  or  event  follows  another 
in  time),  it  is  abstract  (i.e.,  it  does  not  mention  specific  types  of  objects  such  as  “ball”  or 
“block”),  and  it  is  parameterized  (i.e.,  it  can  occur  for  multiple  directions  d). 

The  structure  of  this  simulation  is  shown  in  Figure  25.  The  input  to  the  system  is  (1)  a  set  of 
event  types  to  model  and  (2)  a  sequence  of  scenarios,  implemented  via  microtheories,  for 
learning  about  this  set  of  events.  The  system  first  finds  instances  of  the  event  types  within  the 
stimuli  and  constructs  temporally-encoded  cases  for  each  event  instance.  Next,  SAGE  is  used  to 
construct  generalizations  of  each  type  of  event.  These  generalizations  are  subsequently  filtered 
and  converted  into  qualitative  models. 

Figure 25: Topology of the Chapter 5 simulation.

28 See Chapter 8 for a simulation that generates and uses conceptual quantities.

To evaluate what is learned, the resulting qualitative models are used on two problem-solving
tasks from the learning science literature: one from Brown (1994), and one from the
Force Concept Inventory (Hestenes et al., 1992). This helps us determine whether the learned
qualitative models can simulate the pre-instructional mental models of students. We are
principally interested in simulating students’ misconceptions - recall that the objective of this
simulation is to model how students learn from observation, and students do not arrive at correct
Newtonian models using observation alone. This simulation provides evidence to support the
first two claims of this dissertation:

Claim 1: Compositional qualitative models provide a consistent computational account of
human mental models.

Claim 2: Analogical generalization, as modeled by SAGE, is capable of inducing qualitative
models that satisfy Claim 1.

The  other  simulations  provide  additional  support  for  Claim  1,  but  no  other  simulation  provides 
support  for  Claim  2.  Importantly,  the  qualitative  models  learned  in  this  simulation  do  not 
describe  continuous  causal  mechanisms.  This  is  because  SAGE  does  not  hypothesize  causal 
mechanisms  such  as  processes  and  quantities  where  none  are  already  believed  to  exist.  We  next 
describe  our  simulation,  including  the  training  and  testing  data,  the  learning  processes,  and  a 
comparison  to  human  mental  models. 

5.1  Using  multimodal  training  data 

The training data for this simulation is multimodal because each training case is created
using two different modes of input: sketches and simplified English. This is a simplified
approximation29 of what a person might encounter in daily experience. Each case contains
relational knowledge that CogSketch encoded from hand-sketched comic graphs such as Figure
26. Each case also contains knowledge that the natural language understanding system EA NLU
(Tomai & Forbus, 2009) semi-automatically encodes from one or more English sentences that
describe the comic graph. The following English sentences accompany the comic graph in
Figure 26:

The  child  child-15  is  here. 

The  child  child-15  is  playing  with  the  truck  truck-15. 

The car car-15 is here.


[Comic graph frames: Pre-13, Push-13, Move-13; consecutive frames are linked by startsAfterEndingOf.]

Figure 26: A comic graph stimulus created using CogSketch

Since cross-modal reference resolution is a difficult open problem, we factor it out by using the
internal tokens from the sketch (in italics) within the sentence. One can think of this as
providing the same kind of information that a teacher would be giving a child by pointing at
objects while talking about them. EA NLU uses the term child-15 to refer to the Child entity
that is playing with the Truck entity truck-15. These are the same entity names used by
CogSketch. The outputs of CogSketch and EA NLU are automatically combined into a single,
coherent scenario microtheory.

29 See Chapter 3 for a discussion of the psychological assumptions and limitations of using sketches as perceptual
output.

30 See Chapter 3 for a functional overview of CogSketch.

For each multimodal scenario microtheory, the simulation finds instances of target concepts
(see Figure 25) such as the two PushingAnObject instances (blue arrows in the middle frame)
and the two MovementEvent instances (green arrows in the rightmost frame) in Figure 26. For
each instance of each event, e.g., the truck moving in the rightmost frame of Figure 26, the
system creates a new microtheory that describes that event. The temporal extent of the event
(e.g., the truck moving) is recorded as the currentState, and other statements in the comic
graph are recorded in the event microtheory, relative to the current state. For example, the event
microtheory that describes the truck’s movement would contain the following statements:


(cotemporal currentState (isa move-truck-15 MovementEvent))
(cotemporal currentState (objectMoving move-truck-15 truck-15))
(cotemporal currentState (motionPathway move-truck-15 Right))
(startsAfterEndingOf currentState (touching truck-15 child-15))
(startsAfterEndingOf currentState (isa push-15-0 PushingAnObject))
(startsAfterEndingOf currentState (providerOfForce push-15-0 child-15))
(startsAfterEndingOf currentState (objectActedOn push-15-0 truck-15))
(startsAfterEndingOf currentState (dir-Pointing push-15-0 Right))

The system encodes the truck’s rightward movement within the currentState, and this
happened right after (i.e., startsAfterEndingOf) the child touched the truck and pushed it to
the right. These statements alone provide a concise account of cause (i.e., PushingAnObject)
and effect (i.e., MovementEvent); however, these are not the only statements in the event
microtheory. There are many other statements that are irrelevant - or worse, confusing - for
learning about cause and effect. These include:




(temporallySubsumes (touching truck-15 ground-15) currentState)
(temporallySubsumes (touching car-15 truck-15) currentState)
(temporallySubsumes (touching car-15 ground-15) currentState)
(cotemporal currentState (isa move-car-15 MovementEvent))
(cotemporal currentState (objectMoving move-car-15 car-15))
(cotemporal currentState (motionPathway move-car-15 Right))


These  irrelevant  statements  describe  the  truck  touching  the  car,  the  car  touching  the  ground,  and 
the  car  moving  simultaneously.  There  are  many  more  such  irrelevant  statements  that  are  not 
shown  here,  including  positional  relations,  relative  sizes  and  shapes  of  the  glyphs,  and  more. 

One  important  task  in  learning  from  observation  is  distinguishing  causally-relevant  information 
from  incidental  or  distracting  information.  This  is  done  automatically  with  SAGE  (see  section 
3.4.3),  and  we  address  this  challenge  next. 

So  far,  we  have  shown  how  the  system  finds  instances  of  its  target  concepts  within 
multimodal  scenario  microtheories  and  creates  a  temporally-encoded  microtheory  for  each 
instance.  The  temporal  relations  help  record  what  might  be  a  cause  and  what  might  be  an  effect 
of  each  event,  e.g.,  if  movement  starts  after  pushing,  then  movement  is  not  a  plausible  cause  of 
pushing,  but  it  is  a  plausible  effect.  Temporal  relations  also  add  significant  relational  structure  to 
the  representation  of  the  event,  which  will  aid  in  analogical  learning  with  SAGE.  The  next 
section describes how SAGE abstracts the central causal structure of these scenarios from the
irrelevant, confusing statements.
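The temporal encoding step can be illustrated with a short Python sketch. It assumes, hypothetically, that each statement carries an inclusive (start frame, end frame) interval over the comic graph frames, and it handles only the three relations that appear in the listings above; the function encode is not part of the described system.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # inclusive (start_frame, end_frame); an assumption


def encode(statement: str, span: Interval, event_span: Interval) -> Optional[str]:
    """Relate a statement to the currentState occupied by the event."""
    s_start, s_end = span
    e_start, e_end = event_span
    if span == event_span:
        return f"(cotemporal currentState {statement})"
    if s_end < e_start:
        # The event's state starts after the statement stops holding.
        return f"(startsAfterEndingOf currentState {statement})"
    if s_start <= e_start and s_end >= e_end:
        return f"(temporallySubsumes {statement} currentState)"
    return None  # other temporal relations are outside this sketch


# Frames: Pre = 0, Push = 1, Move = 2; the truck's movement occupies frame 2.
move = (2, 2)
push_fact = encode("(isa push-15-0 PushingAnObject)", (1, 1), move)
ground_fact = encode("(touching truck-15 ground-15)", (0, 2), move)
car_fact = encode("(isa move-car-15 MovementEvent)", (2, 2), move)
print(push_fact)
print(ground_fact)
print(car_fact)
```

Note that the encoding treats relevant and irrelevant statements alike; separating them is left to the generalization step.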
5.2 Creating generalizations of Pushing, Moving, and Blocking with SAGE

The system maintains a separate SAGE generalization context for each of the event types it is
given to learn (see Figure 25). This simulation creates three generalization contexts: one for
PushingAnObject, one for Blocking, and one for MovementEvent. Instances of events are
added to the generalization context for that event type. For example, each temporally-encoded
microtheory that describes a MovementEvent instance is added to the MovementEvent
generalization context - and no other - to be automatically generalized using SAGE.




Figure  27:  The  three  SAGE  generalization  contexts  after  using  SAGE  to  generalize 
temporally-encoded  microtheories  about  pushing,  moving,  and  blocking. 

The  contents  of  these  generalization  contexts  during  a  simulation  are  illustrated  in  Figure  27. 
Using  a  separate  SAGE  generalization  context  for  each  concept  prevents  SAGE  from  conflating 
different  concepts  during  supervised  learning.  Within  each  context,  however,  SAGE  may  have 
multiple generalizations. For instance, within the PushingAnObject context, there may be a
pushing  generalization  where  a  MovementEvent  follows  the  push,  and  another  pushing 
generalization  where  a  Blocking  occurs  simultaneously  with  the  push  and  no  MovementEvent 
ensues.  This  clustering  is  unsupervised,  arising  from  the  properties  of  the  data  itself. 


31  The  SAGE  generalization  algorithm  is  described  in  Chapter  3. 
As discussed in Chapter 3, each SAGE generalization contains a set of statements, and each
statement has a probability. Recall that the microtheories given to SAGE in this simulation
contain temporal relations between a currentState and other events and statements.
Consequently, the generalizations produced by SAGE will be probabilistic accounts of what
happened before, during, and after the currentState. The statements with high probability
are more characteristic of the event than low probability statements.

The  probabilistic  generalizations  produced  by  SAGE  are  not  themselves  causal  models. 
However,  they  contain  sufficient  temporal  and  statistical  information  to  create  descriptive 
qualitative  models. 

5.3  Converting  SAGE  generalizations  to  qualitative  models 

This  work  is  the  first  to  construct  qualitative  models  from  probabilistic  generalizations.  SAGE 
generalizations  are  converted  to  qualitative  models  in  two  steps:  (1)  probability  filtering  and  (2) 
causal  assignment.  Probability  filtering  involves  discarding  expressions  within  the 
generalization  that  are  below  a  given  probability  threshold.  This  retains  expressions  that  are 
more  probable  in  the  generalization  (e.g.,  that  two  objects  are  touching  during  a  push  event)  and 
discards  expressions  that  are  less  probable  (e.g.,  that  one  of  the  objects  is  a  toy  truck). 
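Probability filtering amounts to a threshold test over the statements of a generalization. In the sketch below, the dictionary representation, the probabilities, and the threshold of 0.8 are all hypothetical values chosen for illustration; the text does not state the threshold used in the simulation.

```python
from typing import Dict


def filter_by_probability(generalization: Dict[str, float],
                          threshold: float) -> Dict[str, float]:
    """Keep statements whose probability is at or above the threshold."""
    return {s: p for s, p in generalization.items() if p >= threshold}


# Invented probabilities for two statements in a pushing generalization.
push_generalization = {
    "(touching ?a ?b)": 1.0,  # holds in every pushing case
    "(isa ?b Truck)": 0.2,    # holds only in some cases
}
kept = filter_by_probability(push_generalization, threshold=0.8)
print(kept)
```

A higher threshold yields a sparser, more general model; a lower one retains more incidental detail from the training cases.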
s relation to event e              Roles in model
---------------------------------  -------------------------
s starts before e starts           cause
s starts after e starts            effect
s subsumes & starts before e       constraint, cause
s subsumes & starts with e         constraint, cause, effect
s and e are cotemporal             constraint, cause, effect

Figure 28: Given a statement s and its temporal relationship to an event e,
how to calculate the causal role(s) of s in a qualitative model of e.

After low-probability statements are filtered, causal assignment determines each remaining
statement’s causal role with respect to the central event. This is a simple lookup, using the
temporal relation(s) between the statement and the currentState where the event occurs. The
lookup table is shown in Figure 28. Sometimes there is equal evidence that a statement can play
multiple roles, such as {constraint, cause} or {constraint, effect} or {constraint, cause, effect}.
In these cases, the system always chooses constraint. To illustrate why this is the case, suppose
that our generalization describes object a starting to touch object b whenever a starts to push b,
but never before and never after. It could be that:

1. touching causes pushing,

2. touching is an effect of pushing, or

3. touching is a necessary constraint for pushing to occur.

The  bias  for  adding  touching  as  a  constraint  seems  intuitive,  but  it  has  important 
implications  for  the  resulting  qualitative  model.  Recall  from  our  discussion  of  qualitative  model 
fragments  in  Chapters  3  and  4  that  constraints  limit  the  logical  applicability  of  the  model 
fragment.  Adding  touching  as  a  constraint  for  pushing  -  rather  than  a  consequence  of  pushing  - 
will  limit  the  logical  applicability  of  the  model,  all  else  being  equal.  This  means  that  the  model 
will apply in fewer situations, so some events may go unpredicted or unexplained. However,
limiting the logical applicability of a model also reduces false positives - that is, it will less
frequently make erroneous predictions or misattributions of causality.
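The lookup in Figure 28, together with the bias toward constraints, can be written as a small table-driven function. The relation labels used as dictionary keys are hypothetical shorthand for the rows of Figure 28, and assign_role is an illustrative name rather than part of the described system.

```python
from typing import Dict, FrozenSet

# Candidate roles for each temporal relation, following Figure 28.
ROLES: Dict[str, FrozenSet[str]] = {
    "starts-before": frozenset({"cause"}),
    "starts-after": frozenset({"effect"}),
    "subsumes-starts-before": frozenset({"constraint", "cause"}),
    "subsumes-starts-with": frozenset({"constraint", "cause", "effect"}),
    "cotemporal": frozenset({"constraint", "cause", "effect"}),
}


def assign_role(relation: str) -> str:
    """Pick one causal role; ambiguous cases resolve to 'constraint'."""
    roles = ROLES[relation]
    if "constraint" in roles:
        return "constraint"
    (role,) = roles  # exactly one candidate remains
    return role


for relation in ROLES:
    print(relation, "->", assign_role(relation))
```

Resolving ambiguity toward constraint trades coverage for precision, which matches the reduction in false positives described above.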
Model Push05
  Participants:
    ?P1 Entity, ?P2 Entity,
    ?P3 PushingAnObject,
    ?D1 Direction, ?D2 Direction
  Constraints:                          ; Object p1 touches and pushes
    (providerOfForce ?P3 ?P1)           ; object p2 in direction d1. The
    (objectActedOn ?P3 ?P2)             ; direction between p1 and p2 is d1.
    (dir-Pointing ?P3 ?D1)
    (touching ?P1 ?P2)
    (dirBetween ?P1 ?P2 ?D1)
    (dirBetween ?P2 ?P1 ?D2)
  Consequences:                         ; This causes object p2 to travel in
    (causes                             ; the direction d1 of the push.
      (active ?self)
      (exists ?M1
        (and (isa ?M1 MovementEvent)
             (objectMoving ?M1 ?P2)
             (motionPathway ?M1 ?D1))))

Figure 29: One of the qualitative models learned by the simulation that causally relates
pushing and movement. Summaries of constraints and consequences shown at right.

After causal assignment occurs, every high-probability statement in the SAGE
generalization has been assigned a role in a qualitative model. The entities in the constraints are
converted to variables and become the participants of the resulting model. This produces
encapsulated histories (Forbus, 1984), which are descriptive qualitative models that causally or
temporally relate events over time. Figure 29 shows one such qualitative model learned by the
simulation. It describes a PushingAnObject event and several spatial and relational
constraints over the objects involved, and a MovementEvent occurs as a result. The set of
constraint statements is directly imported as constraints of the model, but causes and effects are
listed in the consequences of the model. For instance, in Figure 29, the MovementEvent ?M1 is
an effect of the PushingAnObject ?P3, so the following statement is a consequence:
(causes
  (active ?self)
  (exists ?M1
    (and (isa ?M1 MovementEvent)
         (objectMoving ?M1 ?P2)
         (motionPathway ?M1 ?D1))))


Recall that the constraints and participants of a qualitative model are logical antecedents to the construction and activation of an instance of the model (i.e., (active ?self)) over those participants. For example, when an instance Push05-Instance of model Push05 is created and activated with ?P2 bound to pushed-ent and ?D1 bound to pushed-dir, the following statements will be inferred in the scenario:


(active Push05-Instance)

(causes
  (active Push05-Instance)
  (exists ?M1
    (and (isa ?M1 MovementEvent)
         (objectMoving ?M1 pushed-ent)
         (motionPathway ?M1 pushed-dir))))


The  causal  relation  therefore  indicates  that  the  activation  of  the  model  fragment  instance  will 
cause  a  new  MovementEvent  with  the  pushed  object  moving  in  the  direction  of  the  push.  This 
means  that  if  this  model  is  instantiated  in  a  scenario,  the  agent  should  predict  movement  to  occur 
as  an  effect,  either  in  the  present  state  or  in  a  subsequent  state. 
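
As a minimal sketch of this instantiation step (not the implementation used in the simulation), the consequence of Push05 can be encoded as nested tuples and its variables replaced by the bindings of an instance. The tuple encoding and the helper names `substitute` and `activate` are assumptions made for illustration only.

```python
# Statements are nested tuples; variables are strings that start with "?".
# This encoding is an illustrative assumption, not the system's representation.
PUSH05_CONSEQUENCE = (
    "causes",
    ("active", "?self"),
    ("exists", "?M1",
     ("and",
      ("isa", "?M1", "MovementEvent"),
      ("objectMoving", "?M1", "?P2"),
      ("motionPathway", "?M1", "?D1"))),
)


def substitute(term, bindings):
    """Replace every bound variable inside a nested-tuple statement."""
    if isinstance(term, tuple):
        return tuple(substitute(t, bindings) for t in term)
    return bindings.get(term, term)


def activate(instance_name, consequences, bindings):
    """Return the statements inferred when a model instance becomes active."""
    full = {**bindings, "?self": instance_name}
    inferred = [("active", instance_name)]
    inferred.extend(substitute(c, full) for c in consequences)
    return inferred


inferred = activate(
    "Push05-Instance",
    [PUSH05_CONSEQUENCE],
    {"?P2": "pushed-ent", "?D1": "pushed-dir"},
)
for statement in inferred:
    print(statement)
```

The existentially quantified ?M1 is deliberately left unbound: the instance predicts that some MovementEvent exists without naming it.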

Suppose, contrary to Figure 29, that ?M1 is actually a cause in the model rather than an effect. In this case, the following statement would be a consequence of the model:

(causes
  (exists ?M1
    (and (isa ?M1 MovementEvent)
         (objectMoving ?M1 ?P1)
         (motionPathway ?M1 ?D1)))
  (active ?self))
This states that if the agent must explain what caused the state of events represented in the constraints of the model, a MovementEvent ?M1 is the cause. This means that if the model is instantiated in a scenario, the agent should predict some event ?M1 to also occur in the present state or to have occurred in the immediately preceding state. The presence of this causal factor within the consequences block of the model may seem counterintuitive, but we must not conflate logical consequences (e.g., as in model formulation) with causal consequences (i.e., effects).

For  this  simulation,  we  gave  the  system  17  multimodal  comic  graphs  as  training  data.  These 
comic  graphs  described  50  instances  of  events,  all  either  PushingAnObject,  Blocking,  or 
MovementEvent.  This  resulted  in  50  temporally-encoded  microtheories  describing  each  event 
instance,  which  resulted  in  ten  SAGE  generalizations  (shown  in  Figure  27)  after  analogical 
learning.  These  were  transformed  into  descriptive  qualitative  models  of  pushing  (e.g.,  Figure 
29),  moving,  and  blocking,  using  the  processes  described  above. 

To summarize, SAGE generalizations are probabilistic abstractions of observations, but they are not causal models in themselves. These are converted into qualitative models in two steps: (1) filtering selects the high-probability statements, and (2) the temporal relations of these statements determine their roles in a qualitative model. We next discuss how these qualitative models compare to the mental models of students on two problem-solving tasks.
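
The two-step conversion can be sketched roughly as follows. This is a simplified illustration, not the procedure used in the simulation: the `GeneralizedStatement` record, the 0.8 threshold, the example probabilities, and the three-valued `timing` tag (relative to the focal event) are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneralizedStatement:
    fact: tuple         # e.g. ("touching", "?P1", "?P2")
    probability: float  # how often the fact appeared across merged examples
    timing: str         # hypothetical tag: "during", "before", or "after"


# Hypothetical mapping from temporal relation to role in the model.
ROLE_FOR_TIMING = {"during": "constraints", "before": "causes", "after": "effects"}


def to_qualitative_model(statements, threshold=0.8):
    """Step 1: keep high-probability facts. Step 2: assign roles by timing."""
    model = {"constraints": [], "causes": [], "effects": []}
    for s in statements:
        if s.probability >= threshold:
            model[ROLE_FOR_TIMING[s.timing]].append(s.fact)
    return model


# Made-up probabilities, chosen only to exercise both steps.
generalization = [
    GeneralizedStatement(("touching", "?P1", "?P2"), 1.0, "during"),
    GeneralizedStatement(("isa", "?M1", "MovementEvent"), 0.9, "after"),
    GeneralizedStatement(("isa", "?P1", "Truck"), 0.1, "during"),
]
model = to_qualitative_model(generalization)
print(model)
```

Under these assumptions the low-probability fact is filtered out, the concurrent fact becomes a constraint, and the later event becomes an effect listed among the consequences.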

5.4  Comparing  the  system’s  models  of  motion  to  students’  mental  models 

We  cannot  directly  observe  students’  mental  models  -  if  we  could,  there  would  be  little  question 
of  how  they  are  represented  and  how  they  change.  Consequently,  we  can  only  compare  the 
system’s  models  to  students’  mental  models  by  comparing  the  predictions  and  explanations  they 
generate  during  problem-solving  tasks.  We  chose  two  problems  from  the  learning  science 
literature: one from Brown (1994), and one from the Force Concept Inventory (Hestenes et al.,
1992).  We  discuss  each  problem,  the  results  from  students,  and  the  results  from  our  simulation. 

Brown  (1994)  showed  a  group  of  73  high-school  students  a  book  resting  on  the  surface  of  a 
table,  and  asked  them  whether  the  table  exerts  a  force  on  the  book.  Here  are  the  most  popular 
answers  provided  by  the  students: 

1. Yes. The table must exert an upward force on the book to counteract the downward force of the book (33 students).

2.  No.  Gravity  pushes  the  book  flat,  and  the  book  exerts  a  force  on  the  table.  The  table 
merely  supports  the  book  (19  students). 

3.  No.  The  table  requires  energy  to  push  (7  students). 

4.  No.  The  table  is  not  pushing  or  pulling  (5  students). 

5.  No.  The  table  is  just  blocking  the  book  (4  students). 

6.  No.  The  book  would  move  upward  if  the  table  exerted  a  force  (4  students). 

Thirty-three  students  correctly  explained  that  the  table  pushes  up  against  the  book.  The 
forty-student  majority  denied  that  the  table  exerted  a  force  on  the  book,  but  for  five  different 
reasons  (answers  2-6).  Some  students  gave  more  than  one  incorrect  explanation.  For  our  present 
purposes,  we  are  interested  in  modeling  the  incorrect  answers,  because  these  are  the  intuitive 
models  of  dynamics  that  students  hold  prior  to  conceptual  change.  If  the  simulation’s  qualitative 
models are comparable in content to student mental models, then the simulation will make the same set of mistakes as students.

Figure 30: The sketch for the problem-solving task from Brown (1994).

Our simulation was given a sketch of the same problem, illustrated in Figure 30. The system had a domain theory containing the qualitative models learned via SAGE and the facts that the omnipresent force of gravity pushes all things (i.e., instances of Entity) downward, but is not an Entity itself. Given the sketched scenario illustrated in Figure 30, we queried the system to (1) find all instances of PushingAnObject that are consistent with the scenario and then (2) explain why a PushingAnObject event between the table and the book in the upward direction must or must not exist.

To complete the first task, the system uses model-based inference (described in Chapter 3) to instantiate all qualitative models whose participants and constraints are satisfied in the scenario. Specifically, the system begins by inferring that gravity pushes all objects downward and then instantiates its qualitative models to create causal explanations and predictions about these PushingAnObject events. All of these events were explained by a model that relates PushingAnObject and Blocking. This model was used to infer two blocking events:

1. Gravity pushes down on the book which pushes down on the table, and the table blocks the book.

2. Gravity pushes down on the table which pushes down on the ground, and the ground blocks the table.
The first inference is similar to student answers (2) and (5) above, used a total of 23 times in
Brown’s  (1994)  experiment,  except  the  simulation  does  not  mention  the  concept  of  support  in 
student  answer  (2).  This  explanation  given  by  the  students  and  the  system  does  not  directly 
confirm  or  deny  that  the  table  pushes  the  book,  but  it  does  describe  the  causal  relationship 
between  pushing  and  blocking  within  the  scenario. 

The system’s second task is to explain why the table must or must not push the book, if there is sufficient evidence present. This involves (1) assuming that the PushingAnObject does in fact occur in the scenario, (2) instantiating qualitative models as in the first task, and then (3)
searching  for  contradictions  that  arise  as  a  result.  Contradictions  are  found  by  querying  for 
inconsistent  patterns,  e.g.,  a  statement  and  its  negation  are  simultaneously  believed  in  the  same 
state  or  an  observable  event  (e.g.,  MovementEvent)  is  inferred  but  not  observed  in  the  scenario. 
If  a  contradiction  is  found,  this  constitutes  an  indirect  proof  that  the  table  does  not  push  the  book. 
The  system  uses  the  qualitative  model  in  Figure  29  to  achieve  this.  This  results  in  the  following 
inference: 

3. The table pushing the book would result in the book moving upward. Since movement is not observed, this is contradictory.

This inference is similar to student answer (6), used by four students in Brown (1994).
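
The indirect proof can be illustrated with a small sketch. It is not the system's contradiction query: `predicted_movements` is a hypothetical stand-in for instantiating the Figure 29 model, and the dictionary encoding of events is an assumption made here.

```python
def predicted_movements(events):
    """Hypothetical stand-in for the Figure 29 model: each push predicts
    that the pushed object moves in the direction of the push."""
    return {
        ("MovementEvent", e["object"], e["direction"])
        for e in events
        if e["type"] == "PushingAnObject"
    }


def contradictions(assumed_event, observed_events):
    """Return the predictions of the assumption that the scenario lacks."""
    return sorted(predicted_movements([assumed_event]) - set(observed_events))


# Assumption under test: the table pushes the book upward.
assumed_push = {"type": "PushingAnObject", "pusher": "table",
                "object": "book", "direction": "Up"}
observed = []  # the book is at rest in the sketch, so no movement is observed

conflicts = contradictions(assumed_push, observed)
assumption_survives = not conflicts
print(conflicts, assumption_survives)
```

Because the predicted upward movement of the book is absent from the observations, the assumption is rejected, which mirrors inference (3).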
In one multiple choice question from the Force Concept Inventory (Hestenes et al., 1992), students are shown a top-down sketch of a puck sliding to the right along a frictionless surface, and asked which path it would traverse if given an instantaneous kick forward. The problem and the proportion of student responses are shown in Figure 31, left. We sketched the problem using CogSketch as a comic graph with a fork in the state space (Figure 31, right), such that after the kick, the puck could traverse one of five different paths (a-e). The simulation decides which path the puck will traverse by exhaustively instantiating qualitative models in the pre-fork state where the foot kicks the puck. The answer (a-e) that matches the simulation’s prediction is chosen.
Figure 31: Problem from the Force Concept Inventory, and student/simulation
responses  (left).  Sketch  of  the  same  problem  using  CogSketch  (right). 

The only model that can be instantiated (i.e., its participants and conditions are satisfied) in this scenario is the model in Figure 29. The causal consequence of this instance is that the puck (bound to slot ?P2) is the subject of a MovementEvent in the direction ?D1 (bound to Up). This behavior is described in choice (a) in the scenario, so this is the choice made by the system. Choice (a) was the most popular incorrect answer of the students tested by Hestenes et al. (1992), which suggests that it is a common misconception.




5.5  Discussion 

This  simulation  induces  descriptive  qualitative  models  from  observations  using  analogical 
generalization.  When  the  qualitative  models  were  used  to  solve  two  problems  from  the  learning 
science  literature,  they  produced  some  of  the  same  incorrect  explanations  and  predictions  as 
novice  students. 

The  fact  that  our  system  uses  qualitative  models  to  simulate  some  of  the  predictions  and 
explanations  of  novices  supports  the  claim  that  qualitative  models  provide  a  consistent 
computational  account  of  human  mental  models.  Since  these  models  were  induced  from 
sketched  observations,  this  simulation  also  supports  the  claim  that  analogical  generalization  is 
capable  of  inducing  qualitative  models.  Importantly,  the  qualitative  models  learned  by  this 
simulation  are  mechanism-free,  since  they  only  describe  causal  relationships  between  discrete 
events.  Since  novices  and  experts  alike  are  capable  of  explaining  mechanisms  of  change  (e.g., 
physical  processes  and  influences  between  quantities),  more  evidence  is  needed  to  support  the 
first  claim. 

The match between our system and novice students relies upon the psychological assumptions of our model discussed in Chapter 1 and the perceptual assumptions about comic graphs discussed in Chapter 3. To summarize: the training data of this simulation are sparser than the observations humans encounter in the world because they contain only causally-relevant entities (e.g., there are no birds flying overhead) and they are already segmented into meaningful qualitative states. These simplifications reduce the complexity of learning and permit the system to learn much faster than people. Further, since the system has complete information about each state, each event (e.g., instance of PushingAnObject) is always observed in conjunction with its constraints (e.g., touching statements). This means that there is a perfect correlation for events and their observed constraints in this simulation, but this information is not always available to people.

We  have  sketched  a  simplified  account  of  how  people  might  develop  mental  models  from 
observing  the  world:  abstracting  common  structure  and  inferring  causal  relations  based  on 
temporal  relations.  Whether  the  system  can  learn  scientifically-accurate,  Newtonian  models  via 
observation  is  an  empirical  question.  It  is  not  a  question  of  knowledge  representation,  since 
qualitative  models  can  represent  scientifically-accurate  models  of  dynamics  (see  Forbus,  1984); 
rather,  it  is  a  question  of  the  inductive  learning  process.  And  since  the  vast  majority  of  students 
only  develop  a  Newtonian  understanding  of  the  world  after  formal  instruction,  we  should  not 
expect  an  accurate  model  of  human  learning  to  induce  Newtonian  dynamics  from  observation 
alone. 

This first simulation only utilizes the explanation-based network insofar as it generates qualitative models to populate the domain theory, and then uses these models to make inferences during problem solving. The next simulations address how qualitative models change, given instruction and interaction.

Chapter 6: Revising mechanism-based models of the seasons

Thus  far,  we  have  simulated  how  mental  models  might  be  induced  from  observations.  However, 
this  does  not  account  for  how  mental  models  change,  or  how  knowledge  is  incorporated  via 
communication  or  instruction.  The  simulation  in  this  chapter  addresses  these  two  topics.  We 
model  middle-school  students  in  a  study  by  Sherin  et  al.  (2012)  who  construct  -  and  in  some 
cases,  revise  -  explanations  of  why  the  seasons  change,  during  a  clinical  interview. 

This  simulation  and  the  simulations  in  Chapters  7-8  assume  that  when  a  student  revises  her 
mechanism-based  explanation  of  a  phenomenon,  such  as  seasonal  change,  she  has  also  revised 
her underlying mental model of that phenomenon. Recall that in Chapter 1, we assumed that
mental models are used to construct explanations of phenomena. If a student revises her
explanation, she has constructed a new explanation that she prefers over her previous
explanation. More specifically, she has recombined her knowledge into a different mental
model, and its structure, assumptions, and inferences are preferable to those of the former mental
model in the context of the phenomenon explained. So, explanation revision is a good indicator
of  mental  model  revision. 

This  simulation  provides  support  for  claims  1  and  3  of  this  dissertation: 

Claim 1: Compositional qualitative models provide a consistent computational account of
human  mental  models. 

Claim  3:  Human  mental  model  transformation  and  category  revision  can  both  be  modeled 
by  iteratively  (1)  constructing  explanations  and  (2)  using  meta-level  reasoning  to  select 
among  competing  explanations  and  revise  domain  knowledge. 
Since we are simulating how students reason about seasonal change, we represent student
domain  knowledge  with  qualitative  model  fragments  to  support  claim  1.  We  use  the 
explanation-based  model  of  conceptual  change  described  in  Chapter  4  to  simulate  mental  model 
transformation  and  support  claim  3.  We  begin  by  discussing  the  learning  science  study  with 
students,  and  then  we  discuss  our  simulation  setup.32 

6.1  How  commonsense  explanations  (and  seasons)  change 

The  experimenters  in  Sherin  et  al.  (2012)  interviewed  35  middle-school  students  regarding  the 
changing  of  the  seasons  to  investigate  how  students  use  commonsense  science  knowledge.  Each 
interview  began  with  the  question  “Why  is  it  warmer  in  the  summer  and  colder  in  the  winter?” 
followed  by  additional  questions  and  sketching  for  clarification.  If  the  interviewee’s  initial 
explanation  of  seasonal  change  did  not  account  for  different  parts  of  the  earth  experiencing 
different  seasons  simultaneously,  the  interviewer  asked,  “Have  you  heard  that  when  it’s  summer 
[in  Chicago],  it  is  winter  in  Australia?”  This  additional  information,  whether  familiar  or  not  to 
the student, often led the student to identify an inconsistency in their explanation and
reformulate  an  answer  to  the  initial  question  by  recombining  existing  beliefs. 

The interview transcript from the student “Angela” is listed in the appendix, courtesy of Sherin et al. Angela begins by explaining that the earth is closer to the sun in the summer than in the winter, which we call the near-far explanation. In this explanation, the seasons change as the earth approaches and retreats from the sun throughout its orbit of the sun. This is illustrated by a student sketch in Figure 32a. When the interviewer asks Angela if she has heard that Australia experiences its winter during Chicago’s summer, and whether this is a problem for her explanation, Angela sees that her explanation is problematic. She eventually changes her answer by explaining that the spin of the earth changes the seasons: the parts of the earth that face the sun experience their summer, while the parts that face away experience winter. We call this the facing explanation. Other students used the near-far explanation and the facing explanation, and many students drew upon idiosyncratic knowledge, e.g., that they had seen a picture of a sunny day in Antarctica, which influenced their explanations.

32 This builds upon the simulation published in Friedman, Forbus, & Sherin (2011).


[Figure labels: Northern Hemisphere Summer; Northern Hemisphere Winter]

Figure 32: Two diagrams explaining seasonal change, courtesy of Sherin et al. (2012). (a) Sketch from a novice student, explaining that the earth is closer to the sun in the summer than in the winter. (b) Scientific explanation involving tilt and insolation.

The interviewer did not relate the correct scientific explanation during the course of the interview, so the students transitioned between various intuitive explanations. The scientifically accurate explanation of the seasons is that the earth’s axis of rotation is tilted relative to its orbital plane and always points in the same direction throughout its orbit around the sun. When the northern hemisphere is inclined toward the sun, it receives more direct sunlight than when tilted away, which results in warmer and cooler temperatures, respectively. This is illustrated in Figure 32b. While many students mentioned that the earth’s axis is tilted, fewer used this fact in an explanation, and none of these were scientifically accurate.

Sherin  et  al.  created  a  master  listing  of  conceptual  knowledge  used  by  the  students  during 
the  interviews,  including  propositional  beliefs,  general  schemas,  and  fragmentary  mental  models. 
Five  of  the  students  from  the  study  were  characterized  with  enough  precision  for  us  to  encode 
their  beliefs  and  mental  models  using  propositions  and  qualitative  model  fragments,  respectively. 

The  rest  of  this  chapter  describes  a  simulation  of  how  these  five  students  construct 
explanations  of  dynamic  systems  from  fragmentary  domain  knowledge  and  how  these 
explanations  are  revised  after  new  information  renders  them  inconsistent.  Each  trial  of  the 
simulation  corresponds  to  a  subset  of  these  students,  so  the  starting  domain  knowledge  varies 
across  the  trials,  but  the  rest  of  the  simulation  is  identical.  We  use  the  Angela  trial  to  describe 
the  workings  of  the  simulation.  As  mentioned  above,  the  students  interviewed  were  not  given 
the  correct  explanation,  but  we  include  an  additional  simulation  trial  that  has  access  to  the 
knowledge  required  for  the  correct  explanation.  This  demonstrates  that  the  system  can  construct 
the  correct  explanation  when  provided  correct  domain  knowledge. 
Our simulation of the students in Sherin et al. uses the conceptual change model described in
Chapter 4 including: (1) the explanation-based network; (2) qualitative model fragments; (3) the
abductive model formulation algorithm; and (4) cost functions to compute preferences over
explanations.  We  next  describe  how  these  processes  construct  and  revise  qualitative  models  and 
explanations. 


ModelFragment AstronomicalHeating
Participants:
    ?heater HeatSource (providerOf)
    ?heated AstronomicalBody (consumerOf)
Constraints:
    (spatiallyDisjoint ?heater ?heated)
Conditions: nil
Consequences:
    (qprop- (Temp ?heated) (Dist ?heater ?heated))
    (qprop (Temp ?heated) (Temp ?heater))

English interpretation: When an astronomical body heated and a heat source heater are spatially separated, the temperature of heated: (1) increases with the temperature of heater and (2) decreases as the distance between them increases.

ModelFragment Approaching-PeriodicPath
Participants:
    ?mover AstronomicalBody (objTranslating)
    ?static AstronomicalBody (to-Generic)
    ?path Path-Cyclic (alongPath)
    ?movement Translation-Periodic (translation)
    ?near-pt ProximalPoint (toLocation)
    ?far-pt DistalPoint (fromLocation)
Constraints:
    (spatiallyDisjoint ?mover ?static)
    (not (centeredOn ?path ?static))
    (objectTranslating ?movement ?mover)
    (alongPath ?movement ?path)
    (on-Physical ?far-pt ?path)
    (on-Physical ?near-pt ?path)
    (to-Generic ?far-pt ?static)
    (to-Generic ?near-pt ?static)
Conditions:
    (active ?movement)
    (betweenOnPath ?mover ?far-pt ?near-pt)
Consequences:
    (i- (Dist ?static ?mover) (Rate ?self))

English interpretation: An object mover travels on a cyclic path path relative to another object static where path is not centered on static. If mover is approaching - but not at - the closest point on path to static, then there is a rate of approach which decreases the distance from mover to static.

Figure 33: AstronomicalHeating (top) and Approaching-PeriodicPath (bottom) model fragments used in the simulation. English interpretations follow each model fragment.
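
To illustrate how the consequences of these two fragments combine in the near-far explanation, the following sketch propagates a direct influence (i-) through the indirect influences (qprop and qprop-). It is an illustration only: the bindings to sun and earth, the tuple encoding, and the `propagate` helper are assumptions, and the sketch ignores the ambiguity that arises when influences on a quantity conflict.

```python
OPPOSITE = {"increasing": "decreasing", "decreasing": "increasing"}

# Direct influence from an active Approaching-PeriodicPath instance:
# (i- (Dist sun earth) (Rate self)) means the distance is decreasing.
DIRECT = {("Dist", "sun", "earth"): "decreasing"}

# Indirect influences from AstronomicalHeating: (dependent, sign, independent).
INDIRECT = [
    (("Temp", "earth"), "-", ("Dist", "sun", "earth")),  # qprop-
    (("Temp", "earth"), "+", ("Temp", "sun")),           # qprop
]


def propagate(direct, indirect):
    """Derive quantity changes by following qprop and qprop- links."""
    changes = dict(direct)
    updated = True
    while updated:
        updated = False
        for dependent, sign, independent in indirect:
            if independent in changes and dependent not in changes:
                direction = changes[independent]
                changes[dependent] = (
                    direction if sign == "+" else OPPOSITE[direction]
                )
                updated = True
    return changes


changes = propagate(DIRECT, INDIRECT)
print(changes[("Temp", "earth")])
```

With the earth approaching the sun, the distance decreases and, through the qprop- link, the temperature of the earth increases; the temperature of the sun has no known change and is left out of the result.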
6.2 Simulating how students construct and revise explanations

The  students  interviewed  by  Sherin  et  al.  (2012)  performed  two  tasks  that  are  especially  relevant 
to  conceptual  change. 

1. Explain existing beliefs (e.g., Chicago and Australia are warmer in their summers than they are in their winters) when prompted.

2. Incorporate new, credible information (e.g., Chicago’s summer coincides with Australia’s winter) and change explanations as needed to improve coherence.

These are the tasks we are interested in simulating in this chapter. We model the first task by (1) using the abductive model formulation algorithm described in Chapter 4 to construct explanations and then (2) using the cost function to determine which explanation is preferred. We model the second task by (1) adding new domain knowledge, (2) searching for contradictions, and then (3) using the cost reduction procedure described in Chapter 4 (restructure-around-artifact in Figure 23) to find more suitable sets of explanations for existing beliefs, when possible.

For the Angela trial, the system starts with a set of model fragments for both the near-far explanation and the facing explanation, since Angela constructed both of these explanations during the interview without learning these concepts from the interviewer. Two of these model fragments and their simplified English translations are shown in Figure 33. The system also contains propositional beliefs, such as the belief that Chicago is warmer in its summer than in its winter. This belief is represented by the following statement:

(greaterThan (M (Temp Chicago) ChiSummer)
             (M (Temp Chicago) ChiWinter))

The M function in this statement takes two arguments - a quantity term such as (Temp Chicago) and a state such as ChiSummer - and denotes the measurement of the quantity within the state. This statement therefore translates to “the temperature of Chicago is greater in its summer than in its winter.” ChiSummer and ChiWinter are the subjects of other beliefs in the system’s domain knowledge, such as:

(isa ChiWinter CalendarSeason)
(isa ChiAutumn CalendarSeason)
(isa ChiSummer CalendarSeason)
(isa ChiSpring CalendarSeason)
(contiguousAfter ChiWinter ChiAutumn)
(contiguousAfter ChiAutumn ChiSummer)
(contiguousAfter ChiSummer ChiSpring)
(contiguousAfter ChiSpring ChiWinter)

These beliefs, including the greaterThan statement, are all present in the system’s adopted domain knowledge microtheory Da at the beginning of the simulation trial, but they are not yet used within any explanations.

6.2.1  Explaining  Chicago’s  seasons 

At the beginning of our Angela trial, we query the system for an explanation of why it is warmer in Chicago’s summer than in its winter. This is done by calling justify-explanandum (Figure 34; see also Chapter 4) with the following inputs: the greaterThan statement as the explanandum; the model fragments in Da as the domain theory; and the adopted domain knowledge microtheory Da as the scenario. The justify-explanandum procedure uses the abductive model formulation procedure (Chapter 4, Figure 17) to instantiate model fragments that help justify the explanandum. These procedures build the network structure for explanation x1 of Chicago’s seasons shown in Figure 35. We step through the procedures in Figure 34 in greater detail to show how the explanation x1 in Figure 35 is constructed. Chapter 4 provided a detailed example of abductive model formulation, so we concentrate here on the justify-ordinal-relation and justify-quantity-change procedures that invoke the abductive model formulation procedure. We assume that an explanandum is one of the following: (1) a symbol that refers to a process instance; (2) an ordinal relation represented by a greaterThan statement; or (3) a quantity change represented by an increasing or decreasing statement. This means that our system does not justify the existence of physical objects, since we are primarily concerned with explaining physical phenomena and events. Also, our system does not justify equalTo relations, since - without information to the contrary - these can be explained by the absence of direct and indirect influences. Any lessThan relation can be converted into a greaterThan relation by reversing its two arguments.
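
The three kinds of explanandum, together with the lessThan conversion, can be sketched as a small dispatch routine. This is an illustration under stated assumptions rather than the system's code: the nested-tuple encoding, the `process_instances` argument, and the function names `normalize` and `classify_explanandum` are introduced here.

```python
def normalize(statement):
    """Rewrite (lessThan a b) as (greaterThan b a); leave the rest alone."""
    if isinstance(statement, tuple) and statement[0] == "lessThan":
        _, a, b = statement
        return ("greaterThan", b, a)
    return statement


def classify_explanandum(explanandum, process_instances=frozenset()):
    """Return which justification procedure applies, or None if none does."""
    explanandum = normalize(explanandum)
    if isinstance(explanandum, str):
        # Symbols are justified only when they name a process instance;
        # the existence of physical objects is not justified.
        return "process" if explanandum in process_instances else None
    head = explanandum[0]
    if head == "greaterThan":
        return "ordinal-relation"
    if head in ("increasing", "decreasing"):
        return "quantity-change"
    return None  # e.g. equalTo relations are not justified


warmer = ("greaterThan",
          ("M", ("Temp", "Chicago"), "ChiSummer"),
          ("M", ("Temp", "Chicago"), "ChiWinter"))
colder = ("lessThan", warmer[2], warmer[1])

print(classify_explanandum(warmer))
print(normalize(colder) == warmer)
```

Here `colder` states the same belief from the winter side, so normalizing it yields the original greaterThan statement.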
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When  justify-explanandum  is  called  on  the  belief  that  Chicago  is  warmer  in  its  summer  than  its 
winter,  the  system  detects  that  the  explanandum  is  an  ordinal  relation,  and  invokes  justify-ordinal- 
relation.  This  procedure  binds  q  to  (Temp  Chicago)  ,  Sj  to  ChiSummer,  and  s2  to  ChiWinter.  It 
then  queries  to  determine  whether  (1)  ChiWinter  is  after  ChiSummer  and  whether  (2)  ChiSummer  is 


Front-ends to abductive model formulation

procedure justify-explanandum(explanandum m, domain D, scenario S)
    if m is a symbol and m is an instance of collection c such that (isa c ModelFragment):
        justify-process(m, D, S)
    else if m unifies with (greaterThan ?x ?y):
        justify-ordinal-relation(m, D, S)
    else if m unifies with (increasing ?x) or with (decreasing ?x):
        let q, d = quantity-of-change(m), direction-of-change(m)
        justify-quantity-change(q, d, D, S)

procedure justify-ordinal-relation(ordinal relation m, domain D, scenario S)
    // m is of the form (greaterThan (M q s1) (M q s2))
    let q, s1, s2 = quantity-of(m), state-1-of(m), state-2-of(m)
    if query S for (after s2 s1) then:
        justify-quantity-change(q, i-, D, S)
    if query S for (after s1 s2) then:
        justify-quantity-change(q, i+, D, S)

procedure justify-quantity-change(quantity q, direction d, domain D, scenario S)
    // Find direct and indirect influences of q
    instantiate-fragments-with-consequence((qprop q ?x), D, S)
    instantiate-fragments-with-consequence((qprop- q ?x), D, S)
    instantiate-fragments-with-consequence((d q ?x), D, S)
    let Ii = query S for indirect influences on q  // results are in form (qprop/qprop- q ?x)
    for each i in Ii:
        let di = direction-of-influence(i)  // qprop or qprop-
        let qi = influencing-quantity(i)
        let dc = d
        if di = qprop- then:
            set dc = opposite(d)
        justify-quantity-change(qi, dc, D, S)

Figure 34: Pseudo-code for constructing explanations about ordinal relations and quantity
changes, from Chapter 4.

after ChiWinter. Since both are true, the beliefs f19 and f20 in Figure 35 are encoded to justify the
explanandum. Now the system must justify how (Temp Chicago) decreases between ChiSummer
and ChiWinter and how it increases between ChiWinter and ChiSummer. This is achieved with
the following two procedure invocations:

justify-quantity-change((Temp Chicago), i-, D, S)

justify-quantity-change((Temp Chicago), i+, D, S)

Notice that these invocations make no mention of ChiWinter and ChiSummer. This is
because the system is building a model of the mechanisms by which the temperature of Chicago
might increase and decrease. These beliefs and causal mechanisms are explicitly quantified in
specific states using temporal quantifiers, represented as white triangles in Figure 35. We discuss
temporal quantifiers before continuing our walk-through.

Consider the temporal quantifier that justifies f20 with f18 in Figure 35. This states that we
can believe f20 (i.e., (holdsIn (Interval ChiSummer ChiWinter) (decreasing (Temp Chicago))))
so long as the belief f18 (i.e., (decreasing (Temp Chicago))) and all beliefs justifying f18
hold within the state (Interval ChiSummer ChiWinter). This compresses the explanation
structure: without these temporal quantifiers, we would have to store each belief b left of f18
as (holdsIn (Interval ChiSummer ChiWinter) b). The temporal quantifiers in the network can be
used to decompress the explanation into this format without any loss of information, but we can
perform temporal reasoning without decompressing.
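The decompression step can be pictured with a minimal Python sketch. It assumes beliefs are stored as nested tuples; the function name `decompress` and the hand-picked list of beliefs are hypothetical and only illustrate the idea of wrapping each compressed belief in an explicit temporal statement.

```python
def decompress(interval, beliefs):
    """Expand a temporal quantifier into one explicit holdsIn statement per belief."""
    return [("holdsIn", interval, belief) for belief in beliefs]


# Beliefs on the justification path of f18 (an illustrative subset).
interval = ("Interval", "ChiSummer", "ChiWinter")
compressed = [
    ("decreasing", ("Temp", "Chicago")),
    ("qprop", ("Temp", "Chicago"), ("Temp", "PlanetEarth")),
    ("decreasing", ("Temp", "PlanetEarth")),
]

expanded = decompress(interval, compressed)
for statement in expanded:
    print(statement)
```

The compressed form stores the interval once; the expanded form repeats it for every belief, which is the storage overhead the temporal quantifier avoids.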

The invocation of justify-quantity-change((Temp Chicago), i-, D, S) begins by abductively
instantiating all model fragments in the domain theory that contain a consequence that unifies
with (qprop (Temp Chicago) ?x), (qprop- (Temp Chicago) ?x), or (i- (Temp Chicago) ?x).
This uses the abductive model formulation algorithm described in Chapter 4.
The result is the instantiation of qualitative models that can contain indirect (i.e., qprop and
qprop-) and direct (i.e., i-) influences on Chicago’s temperature to help explain why it
decreases. After these invocations, the procedure justify-quantity-change finds these and other
influences which explain Chicago’s temperature decreasing within the scenario model. In our
example, it finds the qualitative proportionality (qprop (Temp Chicago) (Temp PlanetEarth)),
represented as f16 in Figure 35, which states that the temperature of Chicago
will decrease if the temperature of the earth decreases. Next the system attempts to justify the
earth decreasing in temperature, (decreasing (Temp PlanetEarth)), plotted as f14 in
Figure 35. This results in the recursive invocation:

justify-quantity-change((Temp PlanetEarth), i-, D, S)

In this recursive invocation, the system finds the model fragment AstronomicalHeating
(shown in Figure 33) with the following consequences:

(qprop- (Temp ?heated) (Dist ?heater ?heated))

(qprop (Temp ?heated) (Temp ?heater))

When the system binds ?heated to PlanetEarth and invokes abductive model formulation, it
instantiates and activates an instance of AstronomicalHeating, which produces the statements
f9 through f11 in Figure 35, including:

(qprop- (Temp PlanetEarth) (Dist TheSun PlanetEarth))

(qprop (Temp PlanetEarth) (Temp TheSun))

Consequently, when the procedure next searches for indirect influences on (Temp
PlanetEarth), it determines that it can justify the earth’s cooling with an increase of (Dist
TheSun PlanetEarth) or a decrease of (Temp TheSun). This makes another recursive
invocation of justify-quantity-change to justify an increase of (Dist TheSun
PlanetEarth). This ultimately creates a Retreating-Periodic instance whose rate increases
the earth’s distance to the sun (statement f12 in Figure 35) during a segment of the earth’s orbit
around the sun.
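The recursion over indirect influences can be sketched in Python as follows. This is a simplified illustration, not the system's implementation: it omits abductive model-fragment instantiation and direct influences, represents quantities as strings, and adds a `seen` set (an assumption of this sketch) to guard against revisiting the same quantity and direction.

```python
def opposite(direction):
    """Flip a direction of change: i- becomes i+, and i+ becomes i-."""
    return "i+" if direction == "i-" else "i-"


def justify_quantity_change(q, d, influences, trace=None, seen=None):
    """Follow qprop/qprop- influences backward from q, recording what must change.

    influences: list of (kind, influenced_quantity, influencing_quantity).
    """
    trace = [] if trace is None else trace
    seen = set() if seen is None else seen
    if (q, d) in seen:
        return trace
    seen.add((q, d))
    trace.append((q, d))
    for kind, influenced, influencing in influences:
        if influenced != q or kind not in ("qprop", "qprop-"):
            continue
        # A negative proportionality reverses the required direction of change.
        dc = opposite(d) if kind == "qprop-" else d
        justify_quantity_change(influencing, dc, influences, trace, seen)
    return trace


INFLUENCES = [
    ("qprop", "Temp(Chicago)", "Temp(PlanetEarth)"),
    ("qprop-", "Temp(PlanetEarth)", "Dist(TheSun, PlanetEarth)"),
    ("qprop", "Temp(PlanetEarth)", "Temp(TheSun)"),
]

trace = justify_quantity_change("Temp(Chicago)", "i-", INFLUENCES)
for quantity, direction in trace:
    print(quantity, direction)
```

In this toy run, a decrease in Chicago's temperature is traced to a decrease in the earth's temperature, which in turn can be justified by an increase in the earth-sun distance or a decrease in the sun's temperature, mirroring the walk-through above.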


Legend

f0   (isa earthPath EllipticalPath)
f1   (spatiallyDisjoint earthPath TheSun)
f2   (isa TheSun AstronomicalBody)
m0   (isa ProximalPoint ModelFragment)
m1   (isa DistalPoint ModelFragment)
m2   (isa Approaching-Periodic ModelFragment)
m3   (isa AstronomicalHeating ModelFragment)
m4   (isa Retreating-Periodic ModelFragment)
f3   (isa TheSun HeatSource)
f4   (spatiallyDisjoint TheSun PlanetEarth)
f5   (isa APP-inst Approaching-PeriodicPath)
f6   (isa AH-inst AstronomicalHeating)
f7   (isa RPP-inst Retreating-PeriodicPath)
f8   (i- (Dist TheSun PlanetEarth) (Rate APP-inst))
f9   (active AH-inst)
f10  (qprop- (Temp PlanetEarth) (Dist TheSun PlanetEarth))
f11  (qprop (Temp PlanetEarth) (Temp TheSun))
f12  (i+ (Dist TheSun PlanetEarth) (Rate RPP-inst))
f13  (increasing (Temp PlanetEarth))
f14  (decreasing (Temp PlanetEarth))
f15  (qprop (Temp Australia) (Temp PlanetEarth))
f16  (qprop (Temp Chicago) (Temp PlanetEarth))
f17  (increasing (Temp Chicago))
f18  (decreasing (Temp Chicago))
f19  (holdsIn (Interval ChiWinter ChiSummer) (increasing (Temp Chicago)))
f20  (holdsIn (Interval ChiSummer ChiWinter) (decreasing (Temp Chicago)))
f21  (greaterThan (M (Temp Australia) AusSummer) (M (Temp Australia) AusWinter))
f22  (greaterThan (M (Temp Chicago) ChiSummer) (M (Temp Chicago) ChiWinter))

Edge types: logical entailment, temporal quantifier

Figure 35: Network plotting explanations x0 and x1 that explain seasonal change in Australia (x0)
and Chicago (x1) using a near-far model of the seasons.

We have described how the system justifies Chicago decreasing in temperature. The system
justifies Chicago’s increase in temperature in an analogous fashion. It uses some of the model
fragment instances created to explain Chicago’s decrease in temperature, such as the
AstronomicalHeating instance. It also instantiates new model fragments, such as an
Approaching-Periodic instance whose rate decreases the earth’s distance to the sun
(statement f8 in Figure 35), which justifies the earth’s increase in temperature (statement f13 in
Figure 35).

After the system has computed the justification structure for the explanandum, it finds all
well-founded explanations of the explanandum and creates a unique explanation node (e.g., x1 in
Figure 35) for each. As we discussed in Chapter 4, multiple explanations may compete to
explain the same explanandum. In our simulation of Angela, there are multiple explanations for
Chicago’s seasons, only one of which (x1) is shown in Figure 35. Consider the following
simplified explanations in English:

• x1: The earth retreats from the sun for Chicago’s winter and approaches for its summer
(shown in Figure 35).

• x'1: The sun’s temperature decreases for Chicago’s winter and increases for its summer.

• x'2: The sun’s temperature decreases for Chicago’s winter, and the earth approaches the
sun for its summer.

• x'3: The earth retreats from the sun for Chicago’s winter, and the sun’s temperature
increases for its summer.

Explanations {x1, x'1, x'2, x'3} compete with each other to explain f22. However, x'1, x'2, and
x'3 are all problematic. Explanations x'2 and x'3 contain asymmetric quantity changes in a cyclic
state space: a quantity (e.g., the sun’s temperature) changes in the summer → winter interval
without returning to its prior value somewhere in the remainder of the state cycle, winter →
summer. Explanation x'1 is not structurally or temporally problematic, but the domain theory
contains no model fragments that can describe the process of the sun changing its temperature.
Consequently, the changes in the sun’s temperature are assumed rather than justified by process
instances. Assumed quantity changes are problematic because they represent unexplainable
changes in a system. These are also problematic under the sole mechanism assumption (Forbus,
1984), which states that all changes in a physical system are the result of processes.33 We have
just analyzed and discredited the system-generated explanations x'1, x'2, and x'3, which compete
with explanation x1. The system makes these judgments automatically, using the artifact-based cost
function described in Chapter 4.
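The asymmetry criterion can be illustrated separately from the cost function. The Python sketch below is not the system's method (the system relies on artifact costs); it is a crude proxy that assumes each qualitative change is recorded as +1 or -1 per interval and flags any quantity whose changes do not cancel over one full cycle. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def asymmetric_quantities(changes):
    """Return quantities whose qualitative changes do not cancel over one cycle.

    changes: iterable of (quantity, direction) pairs, one per interval in which
    the explanation says the quantity changes; direction is +1 or -1.
    """
    net = defaultdict(int)
    for quantity, direction in changes:
        net[quantity] += direction
    return {q for q, total in net.items() if total != 0}


# x1: the earth retreats (summer -> winter) and approaches (winter -> summer).
x1 = [("Dist(TheSun, PlanetEarth)", +1), ("Dist(TheSun, PlanetEarth)", -1)]

# x'2: the sun cools (summer -> winter) and the earth approaches (winter -> summer).
x2_prime = [("Temp(TheSun)", -1), ("Dist(TheSun, PlanetEarth)", -1)]

print(asymmetric_quantities(x1))
print(asymmetric_quantities(x2_prime))
```

Under this proxy, x1 leaves every quantity balanced across the cycle, while x'2 leaves both the sun's temperature and the earth-sun distance unbalanced.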

The cost function computes the cost of an explanation as the sum of the cost of new artifacts
(e.g., model fragments, model fragment instances, assumptions, contradictions, etc.34) within that
explanation. In our example, x1 is the preferred (i.e., lowest cost) explanation, so the system
assigns x1 to the explanandum within the preferred explanation mapping E, and thereby explains
Chicago’s temperature variation using the near-far model.


33  The  agent  might  explicitly  assume  that  an  unknown,  active,  process  is  directly  influencing  the  quantity,  but  such 
an  assumption  is  still  objectively  undesirable  within  an  explanation. 

34 For a complete listing of epistemic artifacts and their numerical costs, see section 4.6.2.

6.2.2 Explaining Australia’s seasons

We next query the system for an explanation of why Australia is warmer in its summer than in its
winter. This invokes justify-explanandum, which constructs explanations for Australia’s
seasons, including the explanation x0 in Figure 35. When the system chooses among competing
explanations for Australia’s seasons using the cost function, the cost of each explanation is
influenced by the explanations it has chosen for previous explanandums (e.g., Chicago’s
seasons). This is because artifacts only incur a cost if they are not presently used in a preferred
explanation. All else being equal, the system is biased to reuse existing artifacts such as model
fragments (e.g., AstronomicalHeating), model fragment instances (e.g., the
AstronomicalHeating instance AH-inst, represented as f6 in Figure 35), and assumptions
that are in other preferred explanations. This causes the system to choose a near-far explanation
for Australia’s seasons (x0 in Figure 35), which contains much of the justification structure of the
preferred explanation for Chicago’s seasons (x1 in Figure 35).
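The reuse bias can be illustrated with a small Python sketch. The per-artifact costs below are hypothetical placeholders (the actual values are listed in section 4.6.2), as are the function name, the assumption artifacts named sun-cools and sun-warms, and the simplification that Australia's near-far explanation uses exactly the same process instances as Chicago's.

```python
# Hypothetical per-artifact costs, for illustration only.
COSTS = {"model-fragment": 4, "instance": 2, "assumption": 10}


def explanation_cost(artifacts, in_use):
    """Sum the cost of artifacts that no preferred explanation already uses."""
    return sum(COSTS[kind] for kind, name in artifacts if (kind, name) not in in_use)


near_far_chicago = {
    ("model-fragment", "AstronomicalHeating"),
    ("model-fragment", "Retreating-Periodic"),
    ("model-fragment", "Approaching-Periodic"),
    ("instance", "AH-inst"),
    ("instance", "RPP-inst"),
    ("instance", "APP-inst"),
}
near_far_australia = set(near_far_chicago)
sun_temp_australia = {
    ("model-fragment", "AstronomicalHeating"),
    ("instance", "AH-inst"),
    ("assumption", "sun-cools"),
    ("assumption", "sun-warms"),
}

in_use = set()
chicago_cost = explanation_cost(near_far_chicago, in_use)
in_use |= near_far_chicago  # Chicago's near-far explanation becomes preferred

australia_near_far_cost = explanation_cost(near_far_australia, in_use)
australia_sun_temp_cost = explanation_cost(sun_temp_australia, in_use)
print(chicago_cost, australia_near_far_cost, australia_sun_temp_cost)
```

Once Chicago's explanation is preferred, every artifact it shares with a candidate explanation for Australia is free, so the candidate that reuses the most structure becomes the cheapest.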

6.2.3  Comparing  the  system’s  explanations  to  student  explanations 

At this point, we want the system to describe the mechanisms that cause seasonal change and
temperature change. Sherin et al. do not give the interviewees a pretest or posttest; rather, they
ask each student to explain the seasons freely. Generating causal explanations in English is
outside the scope of this research, so we have our system describe causal models using influence
graphs, as illustrated in Figure 36. Given one or more explanations, the system automatically
constructs an influence graph of the explanations by (1) creating a vertex for every quantity
described in the explanation and (2) creating a directed edge for every influence described in the
explanation. In the case of Figure 36, the system graphs the two preferred explanations, so that
both Australia’s seasons and Chicago’s seasons are explainable using the same mechanisms.
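The two graph-construction steps can be sketched in Python. This is an illustrative simplification: beliefs are tuples, quantities are strings, only a subset of the beliefs in Figure 35 is included, and vertices are created only for quantities that take part in an influence. The function name `influence_graph` is hypothetical.

```python
INFLUENCE_KINDS = {"qprop", "qprop-", "i+", "i-"}


def influence_graph(explanations):
    """Build (vertices, edges) from the influence statements of the explanations."""
    vertices, edges = set(), set()
    for beliefs in explanations:
        for belief in beliefs:
            if belief[0] in INFLUENCE_KINDS:
                kind, influenced, influencer = belief
                vertices.update((influenced, influencer))
                # Edges point from the influencing quantity to the influenced one.
                edges.add((influencer, influenced, kind))
    return vertices, edges


shared = [
    ("qprop-", "Temp(PlanetEarth)", "Dist(TheSun, PlanetEarth)"),
    ("i+", "Dist(TheSun, PlanetEarth)", "Rate(RPP-inst)"),
    ("i-", "Dist(TheSun, PlanetEarth)", "Rate(APP-inst)"),
]
x1 = shared + [("qprop", "Temp(Chicago)", "Temp(PlanetEarth)"),
               ("decreasing", "Temp(Chicago)")]
x0 = shared + [("qprop", "Temp(Australia)", "Temp(PlanetEarth)")]

vertices, edges = influence_graph([x1, x0])
print(len(vertices), len(edges))
```

Because vertices and edges are sets, structure shared by both explanations appears only once in the combined graph.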

The  majority  of  the  influence  graph  in  Figure  36  describes  continuous  causal  mechanisms 
that  are  common  to  both  explanations.  The  only  explanation-specific  components  are  the 
temperatures  of  Chicago  and  Australia  and  their  qualitative  proportionalities  to  the  temperature 


[Figure 36 diagram. Nodes: the rate of the earth’s translation, the rates of the Retreat and
Approach processes, Dist(Earth, Sun), Temp(Earth), Temp(Chi), and Temp(Aus). Retreat is active
over [ChiSum->ChiWin] and [AusSum->AusWin]; Approach is active over [ChiWin->ChiSum] and
[AusWin->AusSum].]

Figure 36: An influence diagram of the near-far explanation of both
Chicago’s (Chi) and Australia’s (Aus) seasons. Nodes are quantities and
edges describe positive and negative direct influences (i+, i-) and indirect
influences (q+, q-). Bracketed ranges quantify process activity.

of the earth. This illustrates how knowledge is reused across explanations and how new
phenomena are explained in terms of existing causal structure. Thus, even though explanations
exist as separate entities in our computational model, they share significant structure.


6.2.4  Accommodating  new,  credible  information 

Thus far, we have described how the system constructs and computes preferences for the two
explanations plotted in Figure 35: one for how Chicago’s seasons change (x1) and another for
how Australia’s seasons change (x0). Other explanations for Chicago’s and Australia’s seasons
exist in the system, but are not preferred since they incur a greater cost.

In Sherin et al.’s study, recall that if a student’s explanation did not account for different
seasons in different parts of the earth (like our simulation’s presently-preferred explanations),
the interviewer asked them whether they were aware that Chicago’s winter coincided with
Australia’s summer. This caused some students, including Angela, to revise their explanation of
seasonal change. This section describes how we simulate the incorporation of new information
and the subsequent explanation revision.

To begin, the following statements are added by the human user:

(cotemporal ChiSummer AusWinter)
(cotemporal ChiAutumn AusSpring)
(cotemporal ChiWinter AusSummer)
(cotemporal ChiSpring AusAutumn)


We refer to this as the opposite seasons information. These statements are from a trusted
source, so each statement incurs a credibility artifact35 of cost -1000 (where negative cost
indicates a benefit). This means that for each of these four statements, the system receives a
numerical benefit as long as it keeps the statement in the adopted domain knowledge
microtheory Da. It will lose this benefit if it removes the statement from Da, though the
statement will continue to exist in the general domain knowledge microtheory D.

After adding these statements to Da, the system searches for contradictions across its
preferred explanations (i.e., x0 and x1 in Figure 35) and adopted domain knowledge in Da. This
is performed with domain-general rules for detecting contradictions, such as:


35 See section 4.6.2 for an overview and example of credibility artifacts.

• A belief and its negation cannot be simultaneously believed.

• A quantity cannot simultaneously be greater than n and less than or equal to n.

• A quantity cannot simultaneously be less than n and greater than or equal to n.

The quantity rules also apply to derivatives of quantities, so the system detects when quantities
are believed to simultaneously increase and decrease.

To illustrate this behavior within the Angela example, consider Australia’s explanation x0 =
(J0, B0, M0) and Chicago’s explanation x1 = (J1, B1, M1). According to the definition of
explanations in Chapter 4, B0 is the set of beliefs in x0 and B1 is the set of beliefs in x1. Since
both explanations refer to the near-far model, the following statements (as well as many others)
are included in these belief sets:


B0 contains the temporally-quantified statement:

(holdsIn (Interval AusSummer AusWinter)
         (decreasing (Temp PlanetEarth)))

(i.e., “Between Australia’s summer and its winter, the earth cools.”)

B1 contains the temporally-quantified statement:

(holdsIn (Interval ChiWinter ChiSummer)
         (increasing (Temp PlanetEarth)))

(i.e., “Between Chicago’s winter and summer, the earth warms.”)


Before the opposite seasons information was incorporated, these statements were not
contradictory. After we add the opposite seasons information, the system infers that the interval
from Australia’s summer to its winter coincides with the interval from Chicago’s winter to its
summer. Therefore, the earth’s temperature is believed to increase and decrease simultaneously,
which is an impossible behavior within a physical system. This is flagged by the contradiction
detection rules, and the following contradiction artifact is created:


(Contra, { (cotemporal ChiSummer AusWinter),
           (cotemporal ChiWinter AusSummer),
           (holdsIn (Interval AusSummer AusWinter)
                    (decreasing (Temp PlanetEarth))),
           (holdsIn (Interval ChiWinter ChiSummer)
                    (increasing (Temp PlanetEarth))) })
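The derivative rule behind this artifact can be sketched in Python. The system uses domain-general contradiction rules; the hand-coded check below is only a stand-in for one of them, and it assumes intervals are (start, end) pairs, quantities are strings, and two intervals coincide when their endpoints are pairwise cotemporal. All function names are hypothetical.

```python
from itertools import combinations

OPPOSITE = {"increasing": "decreasing", "decreasing": "increasing"}


def same_state(a, b, cotemporal):
    """Two states are treated as the same if identical or asserted cotemporal."""
    return a == b or (a, b) in cotemporal or (b, a) in cotemporal


def coincide(i1, i2, cotemporal):
    """Two (start, end) intervals coincide if their endpoints are pairwise cotemporal."""
    return same_state(i1[0], i2[0], cotemporal) and same_state(i1[1], i2[1], cotemporal)


def derivative_contradictions(holds, cotemporal):
    """Find pairs of beliefs in which one quantity rises and falls in coinciding intervals.

    holds: list of (interval, direction, quantity) triples.
    """
    found = []
    for (i1, d1, q1), (i2, d2, q2) in combinations(holds, 2):
        if q1 == q2 and OPPOSITE[d1] == d2 and coincide(i1, i2, cotemporal):
            found.append(((i1, d1, q1), (i2, d2, q2)))
    return found


holds = [
    (("AusSummer", "AusWinter"), "decreasing", "Temp(PlanetEarth)"),
    (("ChiWinter", "ChiSummer"), "increasing", "Temp(PlanetEarth)"),
]

before = derivative_contradictions(holds, set())
opposite_seasons = {("ChiSummer", "AusWinter"), ("ChiWinter", "AusSummer")}
after = derivative_contradictions(holds, opposite_seasons)
print(len(before), len(after))
```

Without the cotemporal statements the two beliefs are unrelated; once the statements are present, the two intervals coincide and the pair is flagged.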


Three  additional  contradictions  are  detected  between  these  explanations:  (1)  the  opposite 
simultaneous  heating/cooling  of  the  earth;  (2)  the  earth  simultaneously  approaching  and 
retreating  for  Chicago  and  Australia,  respectively;  and  (3)  the  earth  simultaneously  retreating  and 
approaching  for  Chicago  and  Australia,  respectively.  Artifacts  are  created  for  these 
contradictions  as  well.  Each  contradiction  artifact  incurs  a  cost  of  100. 

Despite gaining numerical benefits for accepting the instructional knowledge about opposite
seasons in Chicago and Australia, the system has detected four contradictions and incurred the
respective costs. Recall from Chapter 4 that the cost of an artifact, such as the contradiction
artifact shown above, is only incurred if every constituent belief is either (1) in the adopted
domain knowledge microtheory Da or (2) in the belief set of a preferred explanation in the
explanation mapping E. Consequently, these contradiction costs might be avoided, while still
retaining the credibility benefits, by revising E or Da. This involves retracting beliefs from
Da or switching preferred explanation(s) to disable this contradiction artifact and other
costly artifacts. This is the role of the procedure restructure-around-artifact described in
Chapter 4 (Figure 23). When this procedure is called with one of the newly-incurred
contradiction artifacts as the input argument, the procedure finds (1) beliefs in Da that support
the contradiction (i.e., the two cotemporal statements above) and (2) explanandums whose
explanations support the contradiction (i.e., Chicago’s seasonal temperature difference and
Australia’s seasonal temperature difference).

For each supporting belief in Da, the system determines whether removing the belief from
Da will lower the overall cost. For example, removing (cotemporal ChiSummer AusWinter)
from Da will remove all four contradictions for a benefit of 400, but it would also
disable the credibility benefit of 1000, so there would be a net loss. Therefore, no change will be
made here. The same is true of removing (cotemporal ChiWinter AusSummer) from Da.
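The arithmetic behind this decision can be written out as a short Python sketch. The two constants come from the text; the function name is hypothetical, and all other artifact costs are assumed to be unaffected by the retraction and are therefore left out.

```python
CONTRADICTION_COST = 100   # cost per active contradiction artifact
CREDIBILITY_COST = -1000   # cost per credible statement kept in Da (negative is a benefit)


def total_cost(active_contradictions, credible_statements_kept):
    """Cost contributed by contradiction and credibility artifacts only."""
    return (active_contradictions * CONTRADICTION_COST
            + credible_statements_kept * CREDIBILITY_COST)


keep_all = total_cost(active_contradictions=4, credible_statements_kept=4)
retract_one = total_cost(active_contradictions=0, credible_statements_kept=3)
print(keep_all, retract_one, retract_one - keep_all)
```

Retracting one cotemporal statement raises the total by 600 (a gain of 400 offset by a loss of 1000), so the statement stays in Da.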

For each supporting explanandum, the system computes the lowest cost explanation. For
example, changing Chicago’s seasonal explanation to another explanation (e.g., the facing
explanation, described above) revokes the beliefs that the earth’s temperature and distance from
the sun change during Chicago’s seasonal intervals. The facing explanation was not initially the
lowest-cost explanation for Chicago’s seasons, but these contradictions have since made the two
near-far explanations much more costly.

When the system changes its explanation for Chicago’s seasons to the facing explanation, it
disables all four contradictions; however, the restructure-around-artifact procedure is not yet
complete. When it processes the final explanandum, Australia’s seasons, the system finds that it
can further reduce cost by changing Australia’s preferred explanation from the near-far
explanation to a facing explanation. This is because using the same model fragments, model
fragment instances, and assumptions as Chicago’s new explanation (i.e., the facing model) is less
expensive than using the near-far model to explain Australia’s seasons. The system then iterates
through the beliefs and explanandums again to determine whether additional unilateral changes
can reduce cost, and since no further action reduces cost, the procedure terminates.
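The explanation-switching half of this procedure can be sketched as a greedy loop in Python. This is a toy illustration, not the procedure from Chapter 4: belief retraction is omitted, each distinct model is charged once to mimic artifact reuse, the model costs 56 and 66 are borrowed from the Deidra & Angela trial purely as illustrative numbers, and the four contradictions are assumed to be active only while both explanandums use the near-far model.

```python
CONTRADICTION_COST = 100                      # per contradiction artifact (from the text)
MODEL_COST = {"near-far": 56, "facing": 66}   # illustrative values only


def total_cost(preferred):
    """Toy cost: pay once per distinct model, plus four contradictions
    while every explanandum is mapped to the near-far model."""
    cost = sum(MODEL_COST[model] for model in set(preferred.values()))
    if all(model == "near-far" for model in preferred.values()):
        cost += 4 * CONTRADICTION_COST
    return cost


def restructure(preferred, candidates):
    """Greedily switch one explanandum at a time while doing so lowers total cost."""
    preferred = dict(preferred)
    improved = True
    while improved:
        improved = False
        for explanandum, options in candidates.items():
            for option in options:
                trial = {**preferred, explanandum: option}
                if total_cost(trial) < total_cost(preferred):
                    preferred, improved = trial, True
    return preferred


candidates = {
    "ChicagoSeasons": ["near-far", "facing"],
    "AustraliaSeasons": ["near-far", "facing"],
}
start = {"ChicagoSeasons": "near-far", "AustraliaSeasons": "near-far"}

result = restructure(start, candidates)
print(result, total_cost(start), total_cost(result))
```

In this toy run, switching Chicago first removes the contradictions, and switching Australia afterwards removes the cost of keeping a second model, which matches the order of events described above. Each accepted switch strictly lowers the cost, so the loop must stop.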

When the procedure terminates, both Chicago’s and Australia’s seasons have been mapped
to explanations that use the facing model. The corresponding influence graph for both preferred
explanations is shown in Figure 37. Both explanations use RotatingToward and
RotatingAway processes to explain change in temperature, the rates of which are qualitatively
proportional to the rate of the earth’s rotation.

Figure 37: An influence diagram of the facing explanation of both
Chicago’s (Chi) and Australia’s (Aus) seasons.

We have just described how the simulation accommodates new information by revising
explanation preferences in E to reduce cost. As we discussed in Chapter 4, the restructuring
procedure is guaranteed to converge because it only performs belief revision if cost can be
reduced, and cost cannot be reduced infinitely. Restructuring is a greedy algorithm, so it is not
guaranteed to find the optimal cost configuration of explanation preferences.

This concludes the Angela trial. Like the student Angela, the computational model begins
the session by explaining the seasons with a near-far explanation and ends the session with a
facing explanation. We simulate five of the students from Sherin et al.’s study, including
Angela. We continue with a description of the simulation setup and experimental results.

6.3  Simulation  results 

We implemented our model on top of the Companions cognitive architecture (Forbus et al.,
2009), ran each trial as described above, and compared our system’s explanations to those of
students. In each trial, the system starts with a subset of knowledge pertaining to a student from
Sherin et al., but no explanations have been constructed. In terms of Figure 35, the starting state
of the system is a series of nodes on the bottom (domain theory) tier of the network, but none
elsewhere. The system is then queried to construct explanations for Chicago’s and Australia’s
seasons, after which we provide the simulation with the information about opposite seasons, and
query the simulation again for an explanation of the seasons.

Figure 38: Influence graphs for additional explanations produced by the simulation. (a)
The tilt of the axis increases and decreases each hemisphere’s distance to the sun. (b) A
simplified correct explanation of the seasons.

The  individual  differences  of  the  students  within  the  interviews  involve  more  than  just 
variations  in  domain  knowledge.  For  example,  some  students  strongly  associate  some  models 
and  beliefs  with  the  seasons  (e.g.,  that  the  earth’s  axis  is  tilted)  without  knowing  the  exact 
mechanism.  To  capture  this  (e.g.,  in  the  “Deidra  &  Angela”  trial  below),  our  system  includes  an 
additional  numerical  penalty  over  beliefs  to  bias  explanation  preference.  We  describe  this 
further  below. 

Ali  &  Kurt  trial.  The  system’s  initial  domain  knowledge  includes:  (1)  the  earth  rotates  on  a 
tilted  axis,  (2)  temperature  is  qualitatively  proportional  to  sunlight,  and  (3)  the  earth  orbits  the 
sun.  However,  there  is  no  knowledge  that  each  hemisphere  is  tilted  toward  and  away  during  the 
orbit.  Consequently,  the  system  computes  nine  explanations  for  both  Chicago  and  Australia,  and 
computes  preference  for  the  facing  explanations  shown  in  Figure  37,  with  a  cost  of  56.  This 
explanation  is  consistent  with  the  opposite  seasons  information,  so  no  revision  occurs  as  a  result. 
Like Ali and Kurt, the simulation starts and ends with the facing explanation.

Deidra & Angela trial. The system’s initial domain knowledge includes: (1) the earth
rotates, (2) the earth orbits the sun and is sometimes closer and sometimes farther, and (3)
sunlight and proximity to the sun both affect temperature. To model Deidra and Angela’s
preference for the distance-based explanation, for this trial we used an additional ten-point cost
on the belief (qprop (Temp X) (Sunlight X)). Under these parameter settings, the
system constructs 16 explanations36 and computes a preference for the near-far explanations
graphed in Figure 36, with a cost of 56. The system also creates facing explanations (graphed in
Figure 37) with a cost of 66; the ten-point penalty makes the facing explanation more expensive
than the near-far explanation. When confronted with the opposite seasons information, the system
(like Deidra and Angela) detects inconsistencies and changes its preferred explanation from the
near-far explanations to the facing explanations.

Amanda  trial.  The  system’s  initial  domain  knowledge  includes:  (1)  the  earth  orbits  the  sun, 
(2)  the  earth  rotates  on  a  tilted  axis,  (3)  when  each  hemisphere  is  tilted  toward  the  sun,  it  receives 
more  sunlight  and  is  more  proximal  to  the  sun,  and  (4)  sunlight  and  proximity  to  the  sun  both 
affect  temperature.  In  the  interview,  Amanda  mentions  two  main  influences  on  Chicago’s 
temperature:  (1)  the  distance  to  the  sun  due  to  the  tilt  of  the  earth,  and  (2)  the  amount  of  sunlight, 
also  due  to  the  tilt  of  the  earth.  Through  the  course  of  the  interview,  she  settles  on  the  latter. 
Amanda  could  not  identify  the  mechanism  by  which  the  tilt  changes  throughout  the  year.  We 
simulated Amanda once with process models for TiltingToward and TiltingAway,
producing the graphs in Figure 38(a) and Figure 38(b) with costs 52 and 67, respectively. However,
since  Amanda  could  not  identify  the  processes  that  increased  and  decreased  the  tilt  of  the  earth, 

36 The increased number of explanations is due to the belief that proximity, in addition to the
amount of sunlight, affects temperature.

we simulated her again without these process models. This produced two similar graphs, but
without anything affecting AxisTilt-Toward(Earth, Sun). This was the final model that
the student Amanda chose as her explanation. The graphs in Figure 38 both describe the tilt of
the earth as a factor of the seasons: graph (a) is incorrect because it describes tilt affecting
distance and temperature, and graph (b) is a simplified correct model.

By  varying  the  domain  knowledge  and  manipulating  the  numerical  costs  of  beliefs,  we  can 
use  the  simulation  to  (1)  construct  student  explanations  and  (2)  revise  explanations  under  the 
same  conditions  as  students.  Further,  in  the  Amanda  trial,  we  provided  additional  process 
models  to  demonstrate  that  the  simulation  can  construct  a  simplified  correct  explanation. 

6.4  Discussion 

In summary, this simulation (1) constructs explanations from available domain knowledge via
abductive model formulation, (2) evaluates the resulting explanations using a cost function, and
(3) detects inconsistencies and re-evaluates its explanations when given new information. By
changing the initial knowledge of the system, we are able to simulate different interviewees’
commonsense scientific reasoning regarding the changing of the seasons. We also demonstrated
that the system can construct the scientifically correct explanation using the same knowledge
representation and reasoning approaches.

This simulation supports the claim that model fragments can simulate mechanism-based
psychological mental models. This is because model fragments (e.g., those in Figure 33) were
used to describe processes and conceptual entities, and were able to capture the causal
mechanisms of students’ explanations. This simulation also supports the third claim of this
dissertation: that conceptual change (in this case, mental model transformation) can be
simulated by constructing and evaluating explanations. The “Deidra & Angela” trial exemplifies
this behavior by shifting explanations and underlying influence graphs (i.e., from that in Figure
36 to that in Figure 37), which represent different student mental models.

The  numerical  explanation  scoring  strategy  used  in  this  simulation  is  domain-general,  albeit 
incomplete.  To  be  sure,  other  factors  not  addressed  by  our  cost  function  are  also  important 
considerations  for  explanation  evaluation:  belief  probability,  epistemic  entrenchment,  diversity 
of  knowledge,  level  of  specificity,  familiarity,  and  the  variable  credibility  of  information  (and 
information  sources).  Incorporating  these  factors  will  help  model  individual  differences  in 
response to instruction (e.g., Feltovich et al., 2001). We discuss some possible extensions in
Chapter  9. 

We  believe  that  this  simulation  is  doing  much  more  computation  than  people  to  construct 
the  same  explanations.  For  example,  the  system  computed  and  evaluated  16  explanations  in  the 
Deidra  &  Angela  trial  when  explaining  Chicago’s  seasons.  As  described  in  Chapter  4,  our 
system  uses  an  abductive  model  formulation  algorithm,  followed  by  a  complete  meta-level 
analysis  of  competing  explanations.  People  probably  use  a  more  incremental  approach  to 
explanation  construction,  where  they  interleave  meta-level  analysis  within  their  model-building 
operations.  Such  an  approach  would  avoid  reifying  explanations  that  are  known  to  be 
problematic  (e.g.,  explanations  x  ’1.3  in  section  6.2.1),  but  it  would  involve  monitoring  the  model 
formulation  process.  The  transcript  of  Angela’s  interview  in  the  appendix  helps  illustrate  the 
incremental  nature  of  psychological  explanation  construction:  Angela  appears  to  construct  a 
second  explanation  only  after  she  realizes  that  her  initial  explanation  is  flawed. 

This  simulation  demonstrates  that  our  computational  model  can  reactively  revise  its 
explanations  to  maintain  consistency  and  simplicity.  However,  this  does  not  capture  the  entirety 
of conceptual change, or even the entirety of mental model transformation. For instance, Angela
and Deidra incorporated new information that forced them to recombine pre-existing knowledge
into a new explanation, but they did not have to incorporate unfamiliar information about
astronomy into their explanations. In contrast, when students learn from formal instruction or
read from a textbook, they often encounter information about new entities, substances, and
physical processes that must be incorporated into their current mental models. This is the subject
of the simulation described in the next chapter.

Chapter 7: Mental model transformation from textbook information

The  last  chapter  simulated  the  revision  of  mechanism-based  mental  models  when  new 
information  causes  inconsistencies.  Formal  instruction  can  involve  more  subtle  conflicts  than 
this,  such  as  learning  about  a  biological  system  at  a  finer  granularity  and  making  sense  of  new 
entities  and  processes.  Consider  the  mental  model  transformation  example  from  Chapter  4:  a 
student  believes  that  blood  flows  from  a  single-chambered  heart,  through  the  human  body,  and 
back,  and  then  reads  that  blood  actually  flows  from  the  left  side  of  the  heart  to  the  body.  This 
new  information  does  not  directly  contradict  the  student’s  mental  model  since  the  text  does  not 
explicitly  state  that  blood  does  not  flow  from  the  heart;  rather,  the  new  information  is  more 
specific  than  the  student’s  present  mental  model,  and  the  conflict  between  beliefs  and  models  is 
not  as  overt  as  it  was  in  the  previous  chapter.  This  simulation  constructs  and  evaluates 
explanations  -  similar  to  the  previous  chapter’s  simulation  -  to  incrementally  transform 
compositional  qualitative  models  when  provided  a  stream  of  textbook  information.  We  simulate 
the  students  in  Chi  et  al.  (1994a)  who  complete  a  pretest  about  the  circulatory  system,  read  a 
textbook  passage  on  the  topic,  and  then  complete  a  posttest  to  assess  their  learning. 

Recall from Chapter 2 that the act of explaining to oneself helps people revise flawed mental
models  (Chi  et  al.,  1994a;  Chi,  2000).  Chi  et  al.  (1994a)  showed  that  when  students  are 
prompted  to  explain  concepts  to  themselves  while  reading  a  textbook  passage  about  the  human 
circulatory  system,  they  experience  a  greater  gain  in  learning  than  students  who  read  each 
sentence  of  the  passage  twice.  Chi  and  colleagues  call  this  the  self-explanation  effect.  Chi 
(2000) describes how self-explanation causes mental model transformation:

1. Explaining the new information causes the recognition of qualitative conflicts (i.e.,
different predictions and structure) between the mental model and the model presented
in the textbook.

2.  The  conflict  is  propagated  in  the  mental  model  to  find  contradictions  in  the 
consequences. 

3.  The  mental  model  is  repaired  using  elementary  addition,  deletion,  concatenation,  or 
feature  generalization  operators. 

The  self-explanation  effect  is  central  to  our  computational  model,  but  we  do  not  implement 
it according to Chi’s (2000) description. Our simulation models the psychological
self-explanation effect by:

1. Constructing new explanations using new textbook information.

2.  Evaluating  the  new  explanations  alongside  previous  ones. 

3. Re-mapping explanandums to new explanations when the computed preferences favor them.

As  shown  in  the  previous  chapter’s  simulation,  changing  the  preferred  explanation  for  an 
explanandum  can  simulate  belief  revision.  We  describe  this  process  in  detail  below. 
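
The three steps above can be sketched as a small bookkeeping routine. This is a minimal sketch rather than the system's implementation: the `Explanation` and `ExplanandumMapping` classes and the `is_preferred` callback are hypothetical names introduced for illustration, and the callback only stands in for the preference computation described in Chapter 4.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Explanation:
    """Hypothetical record: a named explanation and the explanandums it covers."""
    name: str
    explanandums: frozenset


@dataclass
class ExplanandumMapping:
    """Maps each explanandum to its currently preferred explanation."""
    preferred: dict = field(default_factory=dict)

    def integrate(self, new, is_preferred):
        """Evaluate `new` against current explanations and re-map where it wins.

        `is_preferred(new, old)` is an assumed stand-in for the meta-level
        preference computation. Returns the explanandums whose preferred
        explanation changed, i.e., the simulated belief revisions.
        """
        revised = []
        for explanandum in sorted(new.explanandums):
            old = self.preferred.get(explanandum)
            if old is None:
                # Nothing explained this explanandum before: adopt, no revision.
                self.preferred[explanandum] = new
            elif old != new and is_preferred(new, old):
                # The earlier explanation loses: re-map the explanandum.
                self.preferred[explanandum] = new
                revised.append(explanandum)
        return revised
```

Under this reading, a revision is nothing more than a changed entry in the mapping; the earlier explanation is kept and could be preferred again if later information shifts the preferences.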

Like  the  previous  simulation,  this  simulation  uses  qualitative  models  to  simulate  students’ 
mental  models,  and  it  uses  the  same  central  model  of  conceptual  change.  Consequently,  this 
simulation provides additional support for the first and third claims of this dissertation:

Claim 1: Compositional qualitative models provide a psychologically plausible
computational  account  of  human  mental  models. 

Claim  3:  Human  mental  model  transformation  and  category  revision  can  both  be  modeled 
by  iteratively  (1)  constructing  explanations  and  (2)  using  meta-level  reasoning  to  select 
among  competing  explanations  and  revise  domain  knowledge. 

We  briefly  discuss  Chi  et  al.’s  (1994a)  study,  which  is  the  basis  for  comparison  in  this 
simulation.  We  then  discuss  how  textbook  information  is  integrated  into  our  system  via 
explanation  construction,  and  the  results  of  our  simulation  (published  as  Friedman  &  Forbus, 
2011). 

7.1  Self-explaining  improves  student  accommodation  of  textbook  material 

Chi  et  al.  (1994a)  studied  the  self-explanation  effect  on  21  eighth-grade  students.  Each  student 
was  given  a  pretest  to  assess  their  knowledge  of  the  human  circulatory  system.  Each  student 
then  read  a  101-sentence  textbook  passage  about  the  circulatory  system,  after  which  they 
completed  a  posttest.  There  were  two  conditions:  the  control  group  (9  students)  read  each 
sentence  in  the  passage  twice,  and  the  experimental  group  (12  students)  read  the  passage  once, 
but  was  prompted  by  the  experimenter  to  explain  portions  of  the  text  throughout  the  reading. 

Part  of  the  pretest  and  posttest  involved  plotting  the  flow  of  oxygen-rich  and  oxygen-poor 
blood  through  the  human  body,  using  arrows  between  various  parts  of  the  body.  The  tests  also 
included  conceptual  questions  about  the  behavior  and  function  of  circulatory  system 
components.  The  mental  models  found  by  the  experimenters  are  shown  in  Figure  39:  the  first 
five are incorrect, and the final “double loop (2)” model is a correct but simplified model. We
describe  each  from  left  to  right: 


[Figure 39 panels, left to right: No Loop, Ebb & Flow, Single Loop, Single Loop (Lung), Double Loop (1), Double Loop (2)]

Figure  39:  Student  models  of  the  human  circulatory  system  from  Chi  et  al.  (1994a). 


1. No loop: blood flows from a single-chambered heart to the body and does not return.

2. Ebb and flow: blood flows from heart to the body and returns to the heart through the
same blood vessels.

3. Single loop: blood flows from heart to body through one set of vessels and returns to
the heart through an entirely different set of vessels.

4. Single loop (lung): blood flows in a heart-lung-body or heart-body-lung cycle and
the lungs play a role in oxygenating blood.

5.  Double  loop  (1):  blood  flows  directly  from  heart  to  both  lungs  and  body,  and  blood 
returns  directly  to  the  heart  from  the  lungs  and  body. 

6.  Double  loop  (2):  same  as  double  loop  (1),  except  the  heart  has  four  chambers,  blood 
flows  top-to-bottom  through  the  heart,  and  at  least  three  of  the  following: 

•  Blood  flows  from  right  ventricle  to  lungs 

•  Blood  flows  from  lungs  to  left  atrium 

•  Blood  flows  from  left  ventricle  to  body 

•  Blood  flows  from  body  to  right  atrium 

Figure  40:  Transitions  between  pretest  and  posttest  models  for  control  and  prompted 
groups  in  Chi  et  al.  (1994a).  Numbers  indicate  the  number  of  students  who  made  the 
given  transition.  See  Figure  39  for  an  illustration  of  each  mental  model. 

The  experimenters  found  that  the  prompted  group  experienced  a  significant  gain  in  learning 
relative  to  the  control  group,  and  that  prompted  students  who  self-explained  most  frequently 
achieved  the  “double  loop  (2)”  model  on  the  posttest.  In  total,  33%  of  the  control  group  and 
66%  of  the  prompted  group  reached  the  correct  mental  model  at  the  posttest.  Results  are 
summarized  in  Figure  40,  with  respect  to  the  models  shown  in  Figure  39. 

Figure  40  shows  that  some  students  in  the  control  group  who  started  with  the  same  model  on 
the  pretest  ended  with  different  models  in  the  posttest.  This  is  indicated  by  the  fork  at  “No 
Loop”  (i.e.,  two  of  these  students  end  at  “No  Loop,”  and  the  remaining  student  transitions  to 
“Double  Loop  (1)”),  and  the  fork  at  “Single  Loop”  (i.e.,  two  of  these  students  transition  to 
“Double  Loop  (1)”  and  the  remaining  student  transitions  to  “Double  Loop  (2)”).  This  means  that 
factors  other  than  the  starting  model  affect  students’  learning  on  this  task.  We  broadly  refer  to 
these  factors  as  individual  differences.  Students  in  the  control  group  were  largely  left  to  learn 
according  to  their  individual  learning  strategies,  while  students  in  the  prompted  group  were 
influenced by prompting from the experimenter. Our simulation attempts to capture (1) the
individual differences of the control group using different explanation evaluation strategies and
(2)  the  majority  of  the  prompted  group  using  a  single  explanation  evaluation  strategy. 

7.2  Simulating  the  self-explanation  effect 

This  simulation  is  laid  out  similarly  to  the  previous  chapter’s  simulation.  The  input  to  the  system 
includes:  (1)  starting  domain  knowledge;  (2)  a  single  preference  ranking37  for  computing 
preferences  over  explanations;  and  (3)  a  sequence  of  scenario  microtheories  containing 
information  from  a  textbook  passage.  The  information  in  the  passage  was  hand-translated  by  me 
into  predicate  calculus.  Items  1  and  2  vary  across  simulation  trials  to  simulate  different  students, 
and  item  3  is  constant  over  all  trials.  Each  trial  of  this  simulation  proceeded  as  follows: 

1.  Begin  the  trial  with  domain  knowledge  specific  to  one  of  the  six  mental  models  shown  in 
Figure  39.  No  explanations  are  present. 

2.  Construct  explanations  for  all  blood  flows  believed  to  exist  in  the  domain  theory. 

3.  Generate  an  influence  graph  of  all  flows  of  blood,  oxygen,  and  carbon  dioxide  from  the 
union  of  preferred  explanations.  This  validates  the  initial  circulatory  model,  for 
comparison  with  student  pretests. 

4. Incrementally integrate textbook information about the circulatory system from a
sequence of scenario microtheories.

5. After all of the textbook information has been integrated, generate influence graphs again
from  the  union  of  preferred  explanations  as  done  in  step  (3).  This  determines  the  final 
circulatory  model,  for  comparison  with  student  posttests. 

37  See  section  4.6.1  for  how  preference  rankings  affect  explanation  preferences. 
ModelFragment ContainedFluid
  Participants:
    ?con  Container  (containerOf)
    ?sub  StuffType  (substanceOf)
  Constraints:
    (physicallyContains ?con ?sub)
  Conditions:
    (greaterThan (Amount ?sub ?con) Zero)
  Consequences:
    (qprop- (Pressure ?self) (Volume ?con))

When a container con physically contains a type of substance sub, a contained fluid
exists. When there is a positive amount of sub in con, the volume of con negatively
influences the pressure of this contained fluid.

ModelFragment FluidFlow
  Participants:
    ?source-con  Container       (outOf-Container)
    ?sink-con    Container       (into-Container)
    ?source      ContainedFluid  (fromLocation)
    ?sink        ContainedFluid  (toLocation)
    ?path        Path-Generic    (along-Path)
    ?sub         StuffType       (substanceOf)
  Constraints:
    (substanceOf ?source ?sub)
    (substanceOf ?sink ?sub)
    (containerOf ?source ?source-con)
    (containerOf ?sink ?sink-con)
    (permitsFlow ?path ?sub ?source-con ?sink-con)
  Conditions:
    (unobstructedPath ?path)
    (greaterThan (Pressure ?source) (Pressure ?sink))
  Consequences:
    (greaterThan (Rate ?self) Zero)
    (i- (Volume ?source) (Rate ?self))
    (i+ (Volume ?sink) (Rate ?self))

When two contained fluids - a source and a sink - are connected by a path, and both
are of the same type of substance, a fluid flow exists. When the path is unobstructed
and the pressure of source is greater than the pressure of sink, the rate of the flow is
positive and it decreases the volume of source and increases the volume of sink.

Figure 41: ContainedFluid (above) and FluidFlow (below) model fragments used in
the simulation, with English interpretations of each model fragment.

We use the simulation’s influence graphs from steps (3) and (5) to assess the simulation’s
learning and compare it to the mental model transformations of Chi et al.’s students in Figure 40.
We have already described the explanation construction procedures in detail: section 4.4
describes how an explanation of heart-to-body blood flow is constructed in this simulation. This
is the essence of simulation step (2) above. Additionally, Chapter 6 describes how influence
graphs are constructed from multiple preferred explanations, which is the essence of steps (3)
and (5) in this simulation. We therefore concentrate on step (4) of the simulation: incrementally
integrating textbook information.
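
The model fragments of Figure 41 can be read operationally. The following is a minimal sketch under loud assumptions: it replaces qualitative quantities with plain numbers, it assumes the caller has already found a path that permits the flow, and it assumes that both participant fluids must be active. The names `ContainedFluid`, `fluid_flow_active`, and `fluid_flow_consequences` are illustrative and are not the system's API.

```python
from dataclasses import dataclass


@dataclass
class ContainedFluid:
    """Simplified instance of the ContainedFluid fragment (numeric values assumed)."""
    substance: str
    container: str
    amount: float
    pressure: float

    def active(self):
        # Condition: (greaterThan (Amount ?sub ?con) Zero)
        return self.amount > 0


def fluid_flow_active(source, sink, path_unobstructed):
    """Check the FluidFlow constraints and conditions for one source/sink pair."""
    same_substance = source.substance == sink.substance      # substanceOf constraints
    participants_active = source.active() and sink.active()  # assumption of this sketch
    pressure_drop = source.pressure > sink.pressure          # greaterThan condition
    return (same_substance and participants_active
            and path_unobstructed and pressure_drop)


def fluid_flow_consequences(source, sink, path_unobstructed):
    """Return the direct influences asserted while the flow is active."""
    if not fluid_flow_active(source, sink, path_unobstructed):
        return []
    # i- on the source volume, i+ on the sink volume, both driven by the flow rate.
    return [("i-", "Volume", source.container), ("i+", "Volume", sink.container)]


# Illustrative values only; the simulation reasons qualitatively, not numerically.
heart = ContainedFluid("Blood", "heart", amount=1.0, pressure=2.0)
body = ContainedFluid("Blood", "body", amount=1.0, pressure=1.0)
influences = fluid_flow_consequences(heart, body, path_unobstructed=True)
```

Separating the activity check from the consequences mirrors the split between a fragment's conditions and its consequences: the influences are asserted only while the conditions hold.
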

7.2.1 Explanandums: situations that require an explanation

Unlike  the  simulation  in  the  last  chapter,  the  explanandums  in  this  simulation  are  not  single 
statements.  Rather,  each  explanandum  describes  a  single  situation,  such  as: 

(isa  naiveH2B  PhysicalTransfer) 

(outOf-Container  naiveH2B  heart) 

(into-Container  naiveH2B  body) 

(substanceOf  naiveH2B  Blood) 

These  four  statements  describe  a  situation  called  naiveH2B.  The  isa  statement  identifies  it  as  a 
PhysicalTransfer  instance,  and  the  outOf-Container,  into-Container,  and 
substanceOf  statements  identify  the  entities  that  fill  these  roles  of  naiveH2B.  Although  the 
situation  is  described  across  four  statements,  the  situation  itself  (naiveH2B)  is  the  explanandum. 
Consider  another  explanandum  situation  called  leftH2B: 

(isa  leftH2B  PhysicalTransfer) 

(outOf-Container leftH2B l-heart)

(into-Container  leftH2B  body) 

(substanceOf  leftH2B  Blood) 

Using  multiple  statements  to  describe  explanandums  allows  us  to  describe  events  with 
incomplete  information.  A  more  complete  account  of  flow  would  also  mention  the  paths  through 
which  the  substance  travels  from  source  to  destination.  The  path  is,  after  all,  a  component  of  our 
FluidFlow model fragment in Figure 41. Formal instruction does not always provide all of the
information  about  the  components  of  a  natural  system,  especially  when  systems  are  described 
from  the  top-down.  For  example,  consider  the  following  sentence  from  the  textbook  passage 
used  by  Chi  et  al.: 

“Blood  returning  to  the  heart  [from  the  body] . . .  enters  the  right  atrium.” 

A  more  complete  passage  would  mention  the  superior  and  inferior  vena  cava,  but  these  are 
omitted,  perhaps  to  keep  focus  on  the  more  general  function  and  structure  of  the  system. 
Consequently,  students  must  assume  the  existence  of  a  flow  path  from  the  body  to  the  right 
atrium.  Likewise,  our  simulation  assumes  the  existence  of  entities  to  fill  the  roles  of  model 
fragments  when  necessary,  using  the  abductive  mechanism  described  in  section  4.4. 
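
To make the idea of a multi-statement explanandum concrete, the sketch below groups statements by situation and fills any unstated role with a placeholder entity. It is an illustration under assumptions, not the abductive mechanism of section 4.4: the function names, the `REQUIRED_ROLES` list, and the tuple used as a placeholder are all hypothetical.

```python
from collections import defaultdict

# Statements describing the leftH2B situation, as (predicate, situation, value).
STATEMENTS = [
    ("isa", "leftH2B", "PhysicalTransfer"),
    ("outOf-Container", "leftH2B", "l-heart"),
    ("into-Container", "leftH2B", "body"),
    ("substanceOf", "leftH2B", "Blood"),
]

# Roles a flow explanation needs; along-Path is the role of ?path in FluidFlow.
REQUIRED_ROLES = ["outOf-Container", "into-Container", "substanceOf", "along-Path"]


def group_situations(statements):
    """Collect each situation's role fillers from its separate statements."""
    situations = defaultdict(dict)
    for predicate, situation, value in statements:
        situations[situation][predicate] = value
    return dict(situations)


def assume_missing_roles(situation, roles, required):
    """Fill unstated roles with placeholder entities (a stand-in for abduction).

    Returns the completed role map and the list of roles that were assumed.
    """
    filled = dict(roles)
    assumed = []
    for role in required:
        if role not in filled:
            # Hypothetical placeholder naming; commits to no specific entity.
            filled[role] = ("SkolemFn", situation, role)
            assumed.append(role)
    return filled, assumed


situations = group_situations(STATEMENTS)
filled, assumed = assume_missing_roles("leftH2B", situations["leftH2B"], REQUIRED_ROLES)
```

Here only the path is assumed, which matches the textbook sentence above: the flow is stated, but the vessel that carries it is not.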

7.2.2  Constructing  explanations  to  generate  the  pre-instructional  model 

When  a  simulation  trial  begins,  there  are  no  justifications  or  explanations  in  the  system.  The 
simulation  has  the  following  information  in  its  domain  knowledge  microtheory:  (1)  a  set  of 
model  fragments  including  those  in  Figure  41;  (2)  propositional  beliefs  about  the  structure  of  the 
circulatory  system;  and  (3)  a  set  of  explanandum  situations  (described  above)  pertaining  to  a 
single  model  of  the  circulatory  system  shown  in  Figure  39.  For  example,  a  simulation  trial  that 
begins  with  the  “single  loop”  model  contains  the  following  explanandum  situations: 

•  Blood  flows  from  the  heart  to  the  body  (i.e.,  naiveH2B). 

•  Blood  flows  from  the  body  to  the  heart. 
[Figure 42 panels: (a) and (b)]

Legend (statements represented by the network nodes):

(isa heart Heart)
(isa (SkolemFn mfi2 ...) Path-Generic)
(physicallyContains heart Blood)
(permitsFlow (SkolemFn mfi2 ...) ...)
(isa Blood StuffType)
(isa mfi2 FluidFlow)
(isa body WholeBody)
(fromLocation mfi2 mfi0)
(physicallyContains body Blood)
(toLocation mfi2 mfi1)
(isa ContainedFluid ModelFragment)
(greaterThan (Amount Blood heart) 0)
(describes mfi2 naiveH2B)
(isa mfi0 ContainedFluid)
(isa naiveH2B PhysicalTransfer)
(substanceOf mfi0 Blood)
(substanceOf naiveH2B Blood)
(containerOf mfi0 heart)
(outOf-Container naiveH2B heart)
(isa FluidFlow ModelFragment)
(into-Container naiveH2B body)

Figure 42: A portion of explanation-based network. (a) Before an explanation has been
constructed for naiveH2B. (b) After an explanation x0 has been constructed for naiveH2B
via abductive model formulation.

It  also  contains  the  following  information  in  its  domain  theory: 


•  A  vessel  path  pathH2B  permits  blood  flow  from  the  heart  to  the  body. 

•  A  different  vessel  path  pathB2H  permits  blood  flow  from  the  body  to  the  heart. 


For  an  even  simpler  example,  consider  the  “no  loop”  circulatory  model  in  Figure  39  described 
above.  This  is  simulated  by  providing  the  simulation  with  the  naiveH2B  information  and  no 
information about paths. The existence of a path will be assumed (i.e., without committing to a
specific  blood  vessel  or  pathway)  for  the  flow. 

All of the starting explanandums and propositional beliefs are contextualized within scenario
microtheories. Each of these scenario microtheories is tagged as a starting microtheory (i.e., it
existed prior to instruction) by labeling the informationSource of the microtheory as
Self-Token (i.e., the symbol denoting the simulation itself). This is important, since the simulation
will later resolve explanation competition based on the informationSource of the constituent
beliefs.

The  next  step  is  to  construct  an  explanation  for  each  starting  explanandum.  The  system 
automatically  detects  explanandums  by  querying  for  situations  that  match  a  specific  pattern: 
descriptions  of  processes  (e.g.,  blood  flow,  oxygen  consumption)  that  are  not  themselves  model 
fragment  instances.  For  each  explanandum,  the  system  uses  the  justify-explanandum  procedure 
and  subsequent  justify-process  procedure  to  construct  an  explanation,  both  of  which  are 
described  in  Chapter  4.  Consider  the  simple  case  of  starting  with  the  “no  loop”  student  model. 
Figure  42(a)  shows  the  system’s  state  prior  to  explaining  naiveH2B,  and  Figure  42(b)  shows  the 
same portion of the network after explanation x0 is constructed for naiveH2B.

See section 4.2 for discussion of scenario microtheories.

Figure 43: Influence graphs generated by the system to describe the relative
concentrations,  infusion,  and  consumption  of  Oxygen.  Left:  using  “Double  loop  (1)” 
model.  Right:  using  “Double  loop  (2)”  model. 

Key: R(x)=Rate of process of type x; Amt(x, y)=Amount of x in y; C(x, y)=Concentration of
x in y; B(x)=Blood in region x; (R/L)A=R-/L-Atrium; (R/L)V=R-/L-Ventricle.


7.2.3  Determining  the  simulation’s  circulatory  model 


Students  in  Chi  et  al.  were  asked  to  draw  the  blood  flow  in  the  human  circulatory  system  as  part 
of  their  pretest  and  posttest  assessment.  We  assess  our  simulation’s  circulatory  model  twice:  (1) 
after  explaining  the  starting  explanandums  and  (2)  after  integrating  the  textbook  information. 
Both  of  these  assessments  are  conducted  by  having  the  system  automatically  generate  influence 
graphs.  This  is  accomplished  with  the  following  steps: 

1. Find all explanandums M in the adopted domain knowledge microtheory Da that
describe the transfer, consumption, or infusion of blood, Oxygen, or Carbon Dioxide.

2. Using the explanandum mapping E described in Chapter 4, identify the preferred
explanations X for the explanandums in M.

3. Using all of the beliefs of explanations X, construct an influence graph describing
transfers, consumption, or infusion of blood, Oxygen, or Carbon Dioxide.

This  produces  between  one  and  three  influence  graphs,  since  all  student  circulatory  models 
in  Chi  et  al.  describe  the  transfer  of  blood,  but  not  all  of  them  describe  oxygen  and  carbon 
dioxide  (see  Figure  39  and  the  corresponding  coding  criteria).  Influence  graphs  describing 
Oxygen  are  shown  in  Figure  43  for  two  different  circulatory  models:  “double  loop  (1)”  (left)  and 
“double  loop  (2)”  (right).  The  “double  loop  (1)”  graph  describes  oxygenated  blood  flowing  from 
the  lung  to  the  heart  via  a  vein  pathway  VeinO,  where  it  mixes  with  deoxygenated  blood  from 
the  body,  flowing  to  the  heart  via  vein  pathway  Veinl.  The  “double  loop  (2)”  graph  has  no  such 
mixture. 
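
The three assessment steps can be sketched for one substance at a time. This is a heavily reduced stand-in, offered as an assumption-laden illustration: the system's influence graphs relate quantities such as rates, amounts, and concentrations, whereas this sketch keeps only flow edges between regions. The function name and the dictionary layouts are hypothetical.

```python
def build_flow_graph(explanandums, preferred, beliefs, substance):
    """Collect flow edges for one substance from the preferred explanations.

    explanandums: dict mapping explanandum name -> substance it concerns
    preferred:    dict mapping explanandum name -> preferred explanation name
    beliefs:      dict mapping explanation name -> list of (substance, source, sink)
    """
    # Step 1: explanandums that describe the substance of interest.
    relevant = [m for m, sub in explanandums.items() if sub == substance]
    # Step 2: their preferred explanations (duplicates collapse in the set).
    chosen = {preferred[m] for m in relevant if m in preferred}
    # Step 3: union the flow beliefs of those explanations into one edge set.
    edges = set()
    for explanation in chosen:
        for sub, source, sink in beliefs.get(explanation, []):
            if sub == substance:
                edges.add((source, sink))
    return edges
```

Because only preferred explanations contribute, an explanation that has lost its competition leaves no trace in the graph, even though it remains stored. Calling the function once each for blood, oxygen, and carbon dioxide yields up to three graphs, and a substance with no explanandums yields an empty one.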

Influence  graphs  constitute  a  partial  comparison  to  the  students  in  Chi  et  al.,  since  the 
students  also  completed  a  quiz  about  the  function  of  the  circulatory  system.  Influence  graphs 
effectively  map  the  simulation’s  circulatory  model  onto  the  space  of  student  models  in  Figure  39, 
but they do not directly measure the simulation’s knowledge about the function of the circulatory
system  and  its  impact  on  human  nutrition. 

7.2.4  Integrating  textbook  information 

At  this  point,  the  system  has  (1)  constructed  explanations  for  each  starting  explanandum  and  (2) 
generated  an  influence  graph  to  describe  its  initial  circulatory  model.  This  section  describes  how 
textbook  information  is  integrated  to  incrementally  transform  this  circulatory  model.  The 
portion  of  the  textbook  passage  used  by  our  simulation  is  listed  in  the  appendix.  For  the 
remainder of this section, we suppose that the simulation started with the “no loop” model of the
circulatory  system  discussed  above. 

We present the textbook information in small increments, as a sequence of scenario
microtheories. Unlike the starting scenario microtheories with Self-Token as the source of
information, these microtheories are encoded with source instruction. Otherwise, they only
vary in content. The first sentence from the textbook passage describes the general structure of
the heart: “The septum divides the heart lengthwise into two sides.” The corresponding scenario
microtheory contains the following facts:

(isa septum Septum)

(physicallyContains heart septum)

(isa l-heart (LeftRegionFn Heart))

(isa r-heart (RightRegionFn Heart))

(partitionedInto heart l-heart)

(partitionedInto heart r-heart)

(between l-heart r-heart septum)

(rightOf r-heart l-heart)

First,  the  adopted  domain  knowledge  microtheory  Da  is  added  as  a  child  of  the  new  scenario 
microtheory  so  that  the  new  information  is  visible  from  this  context.  This  scenario  microtheory 
does  not  contain  an  explanandum,  so  nothing  new  requires  an  explanation.  However,  new 
entities are described, including septum, l-heart, and r-heart. These entities did not exist
in the simulation’s “no loop” circulatory system model. Consequently, the simulation uses the
preference rules described in Chapter 4 to encode preferences over entities, where possible. The
following  preferences  are  computed: 

1. (isa heart Heart) <^s_c (isa l-heart (LeftRegionFn Heart))

2. (isa heart Heart) <^s_c (isa r-heart (RightRegionFn Heart))

3. (isa heart Heart) <^i_c (isa l-heart (LeftRegionFn Heart))

4. (isa heart Heart) <^i_c (isa r-heart (RightRegionFn Heart))

5. (isa l-heart (LeftRegionFn Heart)) <^n_c (isa heart Heart)

6. (isa r-heart (RightRegionFn Heart)) <^n_c (isa heart Heart)

Preferences 1 and 2 are specificity (s) preferences: they are computed because heart is
partitionedInto the subregions r-heart and l-heart, which makes each subregion more specific
than heart. Preferences 3 and 4 are instruction (i) preferences: since l-heart and r-heart are both
comparable to heart for specificity and are supported by instruction (i.e., with information
source instruction), they are preferred in this (i) dimension. Finally, preferences 5 and 6 are
prior knowledge (n) preferences: since l-heart and r-heart are both comparable to heart
for specificity, but neither was present prior to instruction (as was heart, with information
source Self-Token), heart is preferred in this (n) dimension.
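
The six entity preferences can be reproduced with a few lines of bookkeeping. This is a minimal sketch of the reasoning in the paragraph above, not the preference rules of Chapter 4: the function name, the triple format, and the `source_of` dictionary are assumptions made for illustration.

```python
def entity_preferences(partitioned_into, source_of):
    """Derive preferences as (less_preferred, dimension, more_preferred) triples.

    partitioned_into: list of (whole, part) pairs from partitionedInto facts
    source_of:        dict mapping entity -> information source of its belief
    Dimensions: "s" specificity, "i" instruction, "n" prior knowledge.
    """
    preferences = []
    for whole, part in partitioned_into:
        # A subregion is more specific than the region it partitions.
        preferences.append((whole, "s", part))
        # Among comparable entities, one supported by instruction wins on "i".
        if source_of[part] == "instruction" and source_of[whole] != "instruction":
            preferences.append((whole, "i", part))
        # Among comparable entities, one known before instruction wins on "n".
        if source_of[whole] == "Self-Token" and source_of[part] != "Self-Token":
            preferences.append((part, "n", whole))
    return preferences


PARTITIONS = [("heart", "l-heart"), ("heart", "r-heart")]
SOURCES = {"heart": "Self-Token", "l-heart": "instruction", "r-heart": "instruction"}
prefs = entity_preferences(PARTITIONS, SOURCES)
```

The dimensions deliberately disagree: the instruction and specificity dimensions favor the new subregions while the prior knowledge dimension favors heart, so a separate ranking over dimensions is needed to settle which entity is used.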

The  next  scenario  microtheory  describes  the  sentence  “The  right  side  pumps  blood  to  the 
lungs,  and  the  left  side  pumps  blood  to  other  parts  of  the  body,”  and  contains  the  following 
statements: 


(physicallyContains r-heart Blood)

(physicallyContains l-heart Blood)

(physicallyContains lung Blood)


(isa  rightH2L  PhysicalTransfer) 

(outOf-Container  rightH2L  r-heart) 

(into-Container  rightH2L  lung) 

(substanceOf  rightH2L  Blood) 

(isa  leftH2B  PhysicalTransfer) 

(outOf-Container leftH2B l-heart)

(into-Container  leftH2B  body) 

(substanceOf  leftH2B  Blood) 

This scenario microtheory describes two processes: rightH2L describes blood flow from
right-heart to lungs and leftH2B describes blood flow from left-heart to body. Preferences can
be computed between explanandums using the following rule:

If one explanandum e1 has one or more role fillers (e.g., l-heart in leftH2B) that are
preferred for specificity (<^s_c) over the corresponding role filler of another explanandum e2
(e.g., heart of naiveH2B), and all other corresponding role fillers that are not preferred
are identical, encode a specificity preference e2 <^s_c e1 (i.e., e1 is preferred).
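
The rule above can be checked mechanically once explanandums are reduced to role-filler maps. The sketch below is one possible reading of the rule, with hypothetical names throughout; it assumes that both explanandums fill the same set of roles and that entity-level specificity preferences are available as (less specific, more specific) pairs.

```python
def preferred_for_specificity(e1, e2, entity_specificity):
    """Return True if explanandum e1 is preferred over e2 for specificity.

    e1, e2:             dicts mapping role name -> role filler
    entity_specificity: set of (less_specific, more_specific) entity pairs
    """
    if e1.keys() != e2.keys():
        return False  # assumption: only like-structured situations are compared
    more_specific_fillers = 0
    for role in e1:
        if e1[role] == e2[role]:
            continue  # identical fillers are allowed
        if (e2[role], e1[role]) in entity_specificity:
            more_specific_fillers += 1  # e1's filler is the more specific one
        else:
            return False  # fillers differ without a specificity preference
    return more_specific_fillers >= 1


NAIVE_H2B = {"outOf-Container": "heart", "into-Container": "body", "substanceOf": "Blood"}
LEFT_H2B = {"outOf-Container": "l-heart", "into-Container": "body", "substanceOf": "Blood"}
RIGHT_H2L = {"outOf-Container": "r-heart", "into-Container": "lung", "substanceOf": "Blood"}
SPECIFICITY = {("heart", "l-heart"), ("heart", "r-heart")}
```

With these inputs, leftH2B is preferred over naiveH2B, while rightH2L is not, because its destination (lung) differs from body without any specificity preference between the two.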


This rule is domain general, since it describes specificity over all events, not just physical
transfers and blood flows. Recall that in our example, the simulation starts with the “no loop”
model, so before encountering leftH2B, the network contains only the explanandum
naiveH2B. The following preferences are therefore computed:

1. naiveH2B <^s_c leftH2B

2. naiveH2B <^i_c leftH2B

3. leftH2B <^n_c naiveH2B

These indicate that (1) leftH2B is more specific than naiveH2B, (2) leftH2B is
supported by instruction and naiveH2B is not, and (3) naiveH2B was present prior to reading,
and leftH2B was not. The simulation next automatically constructs and evaluates explanations
for new explanandums leftH2B and rightH2L. We describe how leftH2B is explained.

Since our discussion focuses on a simulation trial with the “no loop” model, the network
contains only an explanation for naiveH2B, as in Figure 44(a). To explain leftH2B, the
simulation invokes justify-explanandum using leftH2B as the explanandum argument. This
constructs an explanation for leftH2B using knowledge about l-heart from the first scenario
microtheory. This explanation x1 is shown in Figure 44(b), coexisting with the explanation x0 for
naiveH2B. Notice that in Figure 44(b), some of the preferences computed above are shown.
Moreover, since the explanandum leftH2B is preferred for specificity over naiveH2B, any
explanation for leftH2B (e.g., new explanation x1) also explains naiveH2B. This is reflected in
Figure 44(b).

According to our discussion of explanation competition in Chapter 4, any two explanations
that explain the same explanandum(s) are in competition. In Figure 44(b), x0 and x1 both explain
naiveH2B, so rule-based preferences are computed between x0 and x1. These
preferences and a preference aggregation function determine which explanation is
assigned to naiveH2B in the explanandum mapping IE. The preferences are computed
using the domain-level preferences discussed above. Let c_heart be the
ContainedFluid instance with participants (?sub, blood) and (?con, heart), and let c_left be
the ContainedFluid instance with participants (?sub, blood) and (?con, l-heart).
Similarly, let f_heart be the FluidFlow instance with binding (?source-con, heart), and let f_left
be the FluidFlow instance with binding (?source-con, l-heart).

1. c_heart <_s^mfi c_left

2. c_heart <_i^mfi c_left

3. c_left <_n^mfi c_heart

4. f_heart <_s^mfi f_left

5. f_heart <_i^mfi f_left

6. f_left <_n^mfi f_heart

These preferences over model fragment instances are used to compute three explanation-level
preferences between the prior heart-to-body explanation x0 and the new left-heart-to-body
explanation x1:

1. x0 <_s^xp x1

2. x0 <_i^xp x1

3. x1 <_n^xp x0

One of these explanations must be mapped to naiveH2B as its preferred explanation in the
explanandum mapping IE; however, the three explanation-level preferences above describe a
preference cycle. Cycles are resolved using a preference aggregation function, as described in
Chapter 4. The preference aggregation function is given a preference ranking, which is an
ordering over preference dimensions (s, i, n, r), where a dimension earlier in the ordering is
more important than a dimension later in the ordering. The aggregation function begins with the
first dimension of the preference ranking and honors those preferences, then honors each of
the preferences in the next dimension as long as it does not create a cycle, and so on for all
dimensions. If n precedes s and i in the preference ranking, the system will prefer x0 over x1;
otherwise, x1 will be preferred. This ultimately determines which explanation will be mapped to
naiveH2B in IE, and thereby affects how the system will explain blood flow on the posttest.
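The aggregation procedure just described can be sketched as follows. This is a minimal illustration rather than the system's implementation: the names `aggregate`, `reachable`, and `preferred` are hypothetical, each preference is encoded as a (less preferred, more preferred) pair, and, as a simplifying assumption, the cycle check is applied to every dimension, including the first.

```python
from typing import Dict, List, Set, Tuple

Pref = Tuple[str, str]  # (less_preferred, more_preferred)


def reachable(edges: Set[Pref], start: str, goal: str) -> bool:
    """Depth-first search: can `goal` be reached from `start` along honored edges?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(better for worse, better in edges if worse == node)
    return False


def aggregate(prefs_by_dim: Dict[str, List[Pref]], ranking: List[str]) -> Set[Pref]:
    """Honor preferences dimension by dimension, skipping any that would close a cycle."""
    honored: Set[Pref] = set()
    for dim in ranking:
        for worse, better in prefs_by_dim.get(dim, []):
            # Adding worse -> better closes a cycle iff worse is reachable from better.
            if not reachable(honored, better, worse):
                honored.add((worse, better))
    return honored


def preferred(candidates: List[str], honored: Set[Pref]) -> List[str]:
    """Candidates that no honored preference places below another item."""
    return [c for c in candidates if not any(worse == c for worse, _ in honored)]


# The three explanation-level preferences from the text.
prefs = {"s": [("x0", "x1")], "i": [("x0", "x1")], "n": [("x1", "x0")]}

prior_first = preferred(["x0", "x1"], aggregate(prefs, ["n", "s", "i", "r"]))
specific_first = preferred(["x0", "x1"], aggregate(prefs, ["s", "i", "n", "r"]))
```

With prior knowledge ranked first, the n preference is honored and the conflicting s and i preferences are skipped, so x0 is selected; with specificity or instruction first, x1 is selected.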

The preference ranking also applies to explanandums: if n precedes s and i in the preference
ranking, then (based on the above preferences) the explanandum naiveH2B will be preferred
over leftH2B. Explanandum preferences are used for pruning: if explanandum a is preferred
over explanandum b, then explanandum b is not used for problem solving, question answering, or
generating an influence graph.

Since preferences are computed over entities and model fragment instances, the preference
ranking ultimately affects the granularity and terminology of the explanation. For example, if
the prior knowledge preference n is first, the system will prefer pre-instructional entities (e.g.,
heart) over more specific entities and regions thereof (e.g., left-heart, right-heart, left-ventricle,
right-ventricle, left-atrium, and right-atrium), and this will be reflected in the choice of
explanations. This is an example of how we can model resistance to change: favoring
pre-instructional entities over new entities whenever a choice is available makes the system
selectively incorporate new information into its qualitative models. Conversely, if instruction (i)
is first in the preference ranking, then textbook information will displace pre-instructional
information in preferred explanations.

As  mentioned  above,  the  preference  ranking  is  an  input  to  the  simulation,  so  each  trial  has  a 
single  preference  ranking  that  it  uses  throughout  learning.  By  varying  this  preference  ranking, 
we  can  change  the  outcome  of  learning  and  thereby  simulate  different  students,  including 
individual  differences.  In  this  simulation  the  preference  ranking  is  an  approximation  of  a 
student’s  learning  strategy.  Recall  that  some  students  in  the  control  group  who  started  with  the 
same  mental  model  in  the  pretest  diverged  in  their  mental  models  at  the  posttest.  As  we  will 
show,  the  preference  ranking  helps  account  for  these  differences. 

We have described how our computational model integrates new information by
constructing explanations and computing preferences. The content of each scenario microtheory
varies, but the explanation construction and evaluation mechanisms are constant.




Legend

f0 (isa heart Heart)
f1 (physicallyContains heart Blood)
f2 (isa Blood StuffType)
f3 (isa body WholeBody)
f4 (physicallyContains body Blood)
mf0 (isa ContainedFluid ModelFragment)
f5 (greaterThan (Amount Blood heart) 0)
f6 (isa mfi0 ContainedFluid)
f7 (substanceOf mfi0 Blood)
f8 (containerOf mfi0 heart)
mf1 (isa FluidFlow ModelFragment)
f15 (isa mfi2 FluidFlow)
f16 (fromLocation mfi2 mfi0)
f17 (toLocation mfi2 mfi1)
f22 (describes mfi1 naiveH2B)
f23 (isa naiveH2B PhysicalTransfer)
f24 (substanceOf naiveH2B Blood)
f25 (outOf-Container naiveH2B heart)
f26 (into-Container naiveH2B body)
f31 (isa l-heart (LeftRegionFn heart))
f32 (physicallyContains l-heart Blood)

Figure 44: Portion of explanation-based network. (a): After explaining blood flow from heart to body
(naiveH2B). (b): After explaining blood flow from the left-heart to the body (leftH2B), with preferences across
concepts (<^c), model fragment instances (<^mfi), and explanations (<^xp).


7.2.5  Assuming  model  participants 


In some cases, an explanandum is presented to the system when the system does not have
complete information. Consider the sentence “Blood returning to the heart [from the body],
which has a high concentration of carbon dioxide and a low concentration of oxygen, enters the
right atrium.” The corresponding scenario microtheory contains the following statements:


(isa bloodToAtrium-Right FlowingFluid)

(substanceOf bloodToAtrium-Right Blood)

(outOf-Container bloodToAtrium-Right body)

(into-Container bloodToAtrium-Right right-atrium)

(valueOf ((ConcentrationOfFn Oxygen) bloodToAtrium-Right)
         (LowAmountFn (ConcentrationOfFn Oxygen)))

(valueOf ((ConcentrationOfFn CarbonDioxide) bloodToAtrium-Right)
         (HighAmountFn (ConcentrationOfFn CarbonDioxide)))


From this description of the blood that flows from the body to the right atrium, the system
can gather most of the participants of a FluidFlow: the substance is blood; the source container
is the body; the destination container is the right atrium; and the ContainedFluid instances
corresponding to these containers are the source and destination fluids. However, no entity is
included in this scenario microtheory that conforms to the collection and constraints of the
?path participant of this FluidFlow, and the agent may not know of any entity that permits
blood flow from the body to the right atrium.

As discussed in section 4.4, the model formulation algorithm assumes the existence of
entities to fill these participant slots. When explaining the situation bloodToAtrium-Right,
suppose the model formulation algorithm cannot bind a known entity to the ?path participant
slot, which corresponds to the role along-Path of FluidFlow (see Figure 41 for details). The
algorithm still creates a new FluidFlow model fragment instance with a unique symbol such as
mfi5, and constructs an entity with a skolem term (discussed in Chapter 5) such as
(SkolemParticipant mfi5 along-Path). This indicates that this entity was assumed as a
participant of mfi5 for the role along-Path. The following two assertions are inferred as well:

(isa (SkolemParticipant mfi5 along-Path) Path-Generic)

(permitsFlow (SkolemParticipant mfi5 along-Path) Blood body right-atrium)

These statements describing the assumed entity will be part of the resulting explanation. This
allows the system to construct qualitative models with partial information.
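A rough sketch of this participant-assumption step is given below. It is illustrative only: the function name `bind_participants` and the dictionary encoding of slots are assumptions, and only the collection (isa) statement is generated; constraint statements such as permitsFlow would be instantiated analogously.

```python
from typing import Dict, List, Optional, Tuple


def bind_participants(mfi: str,
                      slots: Dict[str, str],
                      known: Dict[str, Optional[str]]) -> Tuple[Dict[str, str], List[str]]:
    """Bind each role of model fragment instance `mfi` to a known entity, or
    assume a skolem entity when none is available.

    slots maps a role name to the collection its filler must belong to;
    known maps a role name to an already-known entity (missing or None if unbound).
    """
    bindings: Dict[str, str] = {}
    assumptions: List[str] = []
    for role, collection in slots.items():
        entity = known.get(role)
        if entity is None:
            # No known entity fits: assume one, named by the instance and role.
            entity = f"(SkolemParticipant {mfi} {role})"
            assumptions.append(f"(isa {entity} {collection})")
        bindings[role] = entity
    return bindings, assumptions


# Two of the FluidFlow roles; the slot table itself is a simplified assumption.
slots = {"substanceOf": "StuffType", "along-Path": "Path-Generic"}
known = {"substanceOf": "Blood"}

bindings, assumptions = bind_participants("mfi5", slots, known)
```

The assumed statements are returned separately so that they can be recorded as part of the resulting explanation.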

7.3  Simulation  results 

Here we describe the results of our simulation. Each trial of our simulation varied across three
parameters: (1) the system’s starting model, one of the six shown in Figure 39; (2) whether or not
the system constructs explanations for new explanandums; and (3) the preference ranking.
Varying the latter two settings makes two psychological assumptions, which we discuss in the
next section.

Each  trial  proceeds  in  the  same  fashion:  (1)  validate  the  starting  (pretest)  model  with 
influence  graphs;  (2)  incorporate  the  textbook  information  via  a  sequence  of  scenario 
microtheories  as  described  above;  and  (3)  determine  the  ending  (posttest)  model  with  influence 
graphs. 

The results are shown in Figure 45. Each node in the figure corresponds to a student
circulatory model in Figure 39, and each labeled arrow between circulatory models indicates that
the simulation transitions from one model to another using the labeled preference ranking. For
instance, by engaging in full self-explanation with preference ranking (s/i, *, *, *) (i.e., the last
three preferences are irrelevant, provided the first is either instruction or specificity), the
simulation could transition to the correct “double loop (2)” circulatory model from any initial
model. Further, recall that using ranking (n, *, *, *) biases the system to favor explanations that
use prior (i.e., starting model) entities, such as heart, over comparable entities encountered via
instruction, such as left-ventricle. This resulted in the simulation learning the most
popular final model in Chi’s control group, “double loop (1)” (Figure 43, left). This model uses
heart instead of the more specific regions of the heart used in “double loop (2)” (Figure 43,
right). By disabling explanation construction (Figure 45, ∅), the system always remained at its
initial circulatory model.

Individual differences in the control group were modeled using different preference
orderings: (n, *, *, *) (4 students), ∅ (3 students), and (s/i, *, *, *) (2 students). The prompted
students were modeled using preference ordering (s/i, *, *, *) (8 students) and (n, *, *, *) (2
students). The remaining two prompted students were not modeled by the system. Both
transitioned to the “single loop (lung)” model: one from “no flow” and one from “single loop.”
The inability of our system to generate these transitions may be due to representation differences,
either in the starting knowledge or in the representation of the instructional passage. We discuss
this further in the next section.

Figure 45: Circulatory model transitions for all simulation trials.

By varying the initial circulatory model, the preference rankings, and whether or not the
system constructs explanations, the system was able to capture 19 out of 21 (>90%) of student
model transitions in the psychological data. Individual differences in the control group were
captured by three parameter settings, and the majority of the prompted group was modeled by
encoding a preference for explanations that contained specific and instructional concepts,
(s/i, *, *, *).
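The coverage figure can be checked directly from the per-setting student counts reported above. The snippet below is only a tally of those counts; the dictionary keys are informal labels for the parameter settings, not identifiers from the system.

```python
# Students captured by each parameter setting, as reported in the text.
control = {"(n,*,*,*)": 4, "no explanation": 3, "(s/i,*,*,*)": 2}
prompted = {"(s/i,*,*,*)": 8, "(n,*,*,*)": 2}
unmodeled_prompted = 2  # both ended at the "single loop (lung)" model

modeled = sum(control.values()) + sum(prompted.values())
total = modeled + unmodeled_prompted
coverage = modeled / total

print(f"{modeled} of {total} transitions captured ({coverage:.1%})")
```

The tally gives 9 control and 10 prompted students modeled, which is the 19 of 21 transitions stated in the text.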

7.4  Discussion 

We have simulated self-explanation using model formulation, metareasoning, and epistemic
preferences. By altering the system’s preference ranking, we are able to affect how it prioritizes
its knowledge and integrates new information.

Our simulation trials vary with respect to (1) whether the system explains textbook
information, and (2) the preference ranking it uses to evaluate explanations. Since our model
learns by explaining, changing setting (1) to disable explanation construction prohibits learning.
This means that some simulation trials will not integrate any textbook information, which
amounts to assuming that some students do not learn from reading the textbook passage. This was
indeed the case for students in Chi et al.’s control group, since two students in Figure 40 started
and ended with the same incorrect model.

Varying the preference ranking assumes that students have different strategies for
assimilating information from text. This must be the case, because we cannot explain the
learning patterns of the control group in Figure 40 based on their starting model alone: of the
three students in the control group who began with the “single loop” model, two of them
transitioned to “double loop (1),” and one transitioned to “double loop (2).” Consequently, the
system must capture these individual differences with at least two different learning strategies,
which we model using preference rankings.

For autonomous learning systems and for modeling human learning over multiple reading
tasks, the preference ranking might need to be more dynamic, reflecting depth of experience
versus the credibility of the source. Nevertheless, the simulation demonstrates good coverage of
the psychological data.

We have shown that the space of preference rankings (s/i, *, *, *) results in the correct model
from any initial model for this task. This may not be the case for other domains and for students
whose mental models are flawed by containing extraneous components. For instance, in this
study, textbook entities (e.g., left-ventricle) were generally at least as specific as the
entities in students’ initial models (e.g., heart). This means that the partial orderings <_s^c and
<_i^c had a near-perfect correspondence over entities.39 We can imagine other cases where this
might not be true. For example, a student may have erroneous initial beliefs about a
left-ventricle-basin region of the left-ventricle. Since this region does not actually exist,
the initial, incorrect entity is more specific than the instructional entity. Any preference ranking
that places specificity before instruction, such as (s, *, *, *), would retain the
left-ventricle-basin misconception in the posttest. The opposite would be true if instruction is
ranked over specificity.

This simulation supports our hypothesis that constructing and evaluating explanations can
model the benefits of self-explanation. Additionally, the knowledge representation was
sufficient to explain the flows of blood, CO2, and O2 in the pretests and posttests in ways that are
compatible with students’ explanations, so that the system’s qualitative models are comparable
to students’ mental models. This provides evidence for our claim that compositional qualitative
models can simulate human mental models.

39 Specificity and instruction do not overlap perfectly in this study. Consider a student who already knows about the
left atrium and left ventricle (the two sub-regions of the left heart): when they read about the left heart early in the
text, the entities in their initial mental model are temporarily more specific than the entities in the textbook model.

While our methods were sufficient to simulate the majority of the students, two of the
students in the self-explanation group were not captured. These students both used the “single
loop (lung)” model at the posttest: one transitioned there from the “no flow” model and the
other from the “single loop” model. This suggests that our model of self-explanation is
incomplete. These students might have hypothesized system components based on the function
of the system. If informed that (1) the lungs oxygenate the blood and that (2) the purpose of the
circulatory system is to provide the body with oxygen and nutrients, one might infer that blood
flows directly from the lungs to the body.

In our simulation, self-explanation generates network structure and preferences, which
makes new knowledge available for later problem-solving. When we disabled self-explanation
(Figure 45, ∅), the new knowledge was unavailable for later use.

We have shown how existing models are recombined to explain new situations and
accommodate new information. This has simulated how people revise and reason with mental
models: new domain elements are acquired through simulated instruction, and conceptual change
is achieved by combining elements of domain knowledge into new, preferred models. This does
not account for the revision of categories and model fragments themselves. We simulate this
type of conceptual change in the next chapter.

Chapter 8: Revising a category of force when explanations fail

Naive  theories  of  force  are  some  of  the  most  widely-studied  misconceptions,  and  are  also  some 
of  the  most  resilient  to  change.  The  questions  of  how  intuitive  theories  of  force  are  learned, 
represented,  and  revised  are  debated  in  the  literature,  but  there  is  some  agreement  that  they  are 
mechanism-based (McCloskey, 1983; Ioannides & Vosniadou, 2002; diSessa et al., 2004) and
learned  and  reinforced  by  experience  (Smith,  diSessa,  &  Roschelle,  1994). 

Here  we  describe  a  simulation  that  creates  and  revises  a  force-like  category  to  explain  a 
sequence  of  observations.40  Categories  and  model  fragments  are  created  and  revised  upon 
explanation  failure.  After  each  observation,  the  system  completes  a  questionnaire  from  previous 
psychology  experiments  (Ioannides  &  Vosniadou,  2002;  diSessa  et  al.,  2004)  so  we  can  compare 
its  answers  to  those  of  students.  We  then  plot  the  system’s  learning  trajectory  against  student 
data to show that the simulation can learn and transition between student-like categories of force.
The  system  transitions  between  mutually  inconsistent  specifications  of  a  force-like  category, 
along  a  humanlike  trajectory.  This  simulation  thereby  provides  evidence  for  claims  1  and  3  of 
this  dissertation: 

Claim 1: Compositional qualitative models provide a consistent computational account of
human  mental  models. 

Claim  3:  Human  mental  model  transformation  and  category  revision  can  both  be  modeled 
by  iteratively  (1)  constructing  explanations  and  (2)  using  meta-level  reasoning  to  select 
among  competing  explanations  and  revise  domain  knowledge. 

40 This builds upon the simulation described in Friedman and Forbus (2010).

Like the simulations in Chapters 6 and 7, this simulation constructs and evaluates
explanations to simulate human conceptual change. However, this simulation also uses
heuristics to revise its model fragments and categories when it fails to explain an explanandum,
and then it attempts explanation again. This conforms to the following pattern of events:

1. A new explanandum within a scenario requires an explanation.

2.  No  explanation  can  be  constructed  that  is  consistent  with  the  scenario.  We  call  this  an 
explanation  failure. 

3.  The  system  finds  heuristics  that  are  applicable  to  the  present  failure  mode. 

4.  Applicable  heuristics  are  sorted  by  their  estimated  complexity  of  change  to  domain 
knowledge. 

5.  Beginning  with  the  heuristic  that  incurs  the  least  change,  execute  the  heuristic  to  add  or 
revise  domain  knowledge  as  necessary.  If  explanation  still  fails,  repeat  with  the  next 
heuristic. 
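The pattern of events above can be sketched as a short control loop. This is a hedged illustration and not the system's code: the `Heuristic` record, the numeric `cost` field standing in for "estimated complexity of change", the toy knowledge base, and the heuristic name reviseExistingFragment are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Heuristic:
    name: str
    cost: int                              # estimated complexity of the change
    applies: Callable[[dict, str], bool]   # applicable to this failure mode?
    execute: Callable[[dict, str], None]   # adds or revises domain knowledge


def explain_with_revision(kb: dict, explanandum: str,
                          explain: Callable[[dict, str], Optional[str]],
                          heuristics: List[Heuristic]) -> Optional[str]:
    """Try to explain; on failure, apply heuristics from least to most change."""
    explanation = explain(kb, explanandum)
    if explanation is not None:
        return explanation
    applicable = [h for h in heuristics if h.applies(kb, explanandum)]
    for heuristic in sorted(applicable, key=lambda h: h.cost):
        heuristic.execute(kb, explanandum)
        explanation = explain(kb, explanandum)
        if explanation is not None:
            return explanation
    return None  # every applicable heuristic was tried without success


# Toy setup: an explanandum is explainable once a matching process exists.
log: List[str] = []


def toy_explain(kb: dict, q: str) -> Optional[str]:
    return f"explained {q}" if q in kb["processes"] else None


def make(name: str, cost: int, effect: Callable[[dict, str], None]) -> Heuristic:
    def execute(kb: dict, q: str) -> None:
        log.append(name)
        effect(kb, q)
    return Heuristic(name, cost, lambda kb, q: True, execute)


heuristics = [
    make("createDecreaseProcess", 2, lambda kb, q: kb["processes"].add(q)),
    make("reviseExistingFragment", 1, lambda kb, q: None),
]
kb = {"processes": set()}
result = explain_with_revision(kb, "x-pos decreasing", toy_explain, heuristics)
```

In this toy run the cheaper heuristic is tried first and fails to produce an explanation, so the loop falls through to the costlier one, mirroring step 5.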

After each explanandum within a scenario is explained, MAC/FAC is used to retrieve a
similar, previously explained scenario. If the two scenarios are sufficiently similar,
discrepancies are detected between the new and previous scenario, and are explained using the
same process as above, using heuristics to revise knowledge as necessary. We describe both of
these explanation-driven processes of change in detail below. First, we outline the results of
Ioannides & Vosniadou (2002) and diSessa et al. (2004), which serve as the bases for
comparison in this simulation.

8.1 Assessing the changing meaning of force in students

Ioannides  &  Vosniadou  (2002)  conducted  an  experiment  to  assess  students’  ideas  of  force.  They 
used  a  questionnaire  of  sketched  vignettes  which  asked  the  student  about  the  existence  of  forces 
on  stationary  bodies,  bodies  being  pushed  by  humans,  and  bodies  in  stable  and  unstable 
positions.  They  concluded  that  several  meanings  of  force  were  held  by  the  students: 

1.  Internal  Force  (11  students):  A  force  exists  inside  all  objects,  affected  by  size/weight. 

2.  Internal  Force  Affected  by  Movement  (4  students):  Same  as  Internal  Force,  but 
position/movement  also  affects  the  amount  of  force. 

3.  Internal  &  Acquired  (24  students):  A  force  exists  due  to  size/weight,  but  objects 
acquire  additional  force  when  set  into  motion. 

4.  Acquired  (18  students):  Force  is  a  property  of  objects  that  are  in  motion.  There  is  no 
force  on  stationary  objects. 

5.  Acquired  &  Push-Pull  (15  students):  Same  as  (4),  but  a  force  exists  on  an  object, 
regardless  of  movement,  when  an  agent  pushes  or  pulls  it. 

6.  Push-Pull  (1  student):  A  force  only  exists  when  objects  are  pushed  or  pulled  by  an 
agent. 

7. Gravity & Other (20 students): Forces of gravity, of push/pull, and acquired force when
objects are moving.

8. Mixed (12 students): Responses were internally inconsistent, and did not fall within the
other categories.


Meaning of Force      Counts in grade order (K, 4th, 6th, 9th), blank cells omitted   Total
Internal              7, 4                                                             11
Internal/Movement     2, 2                                                             4
Internal/Acquired     4, 10, 9, 1                                                      24
Acquired              5, 11, 2                                                         18
Acquired/Push-Pull    5, 10                                                            15
Push-Pull             1                                                                1
Gravity/Other         3, 1, 16                                                         20
Mixed                 2, 6, 4                                                          12

Figure 46: Occurrences of meaning of force, by grade.


The  frequencies  of  responses  by  grade  are  listed  in  Figure  46.  Though  these  data  were 
gathered  on  different  students  across  grades,  they  illustrate  a  trend:  Kindergarteners  favor  the 
“Internal”  meaning  of  force,  and  then  transition  through  the  “Internal  &  Acquired”  meaning  to 
the  “Acquired”  meaning.  By  grade  9,  students  tend  to  adopt  the  “Acquired  &  Push-Pull”  and 
“Gravity  &  Other”  meanings. 

diSessa et al. (2004) conducted a replication of Ioannides & Vosniadou (2002) using a
modified questionnaire, but were not able to reliably classify students using the same coding
criteria. diSessa et al.’s conclusions include: (1) students do not form and transition between
coherent theories (cf. Ioannides & Vosniadou, 2002); (2) rather, student theories are composed of
small, contextualized pieces of knowledge, some of which are idiosyncratic; and therefore (3)
classifying each student into one of several coherent theories does not help us understand the
processes by which students use and revise conceptual knowledge. diSessa et al.’s conclusions
are consistent with the knowledge in pieces perspective discussed in Chapter 2.

Despite this controversy, the student data from Ioannides & Vosniadou’s study provide a
clear basis for comparison for our simulation. Additionally, since our approach incorporates
ideas from both knowledge in pieces (i.e., we represent domain knowledge with globally
incoherent, composable elements) and theory theory (i.e., explanations are coherent aggregates
of said elements), we have the opportunity to demonstrate how an agent with globally incoherent
domain knowledge can transition through a trajectory of apparently coherent meanings of force.

Figure 47: At left: a four-frame comic graph used as training data.
At right: five of the ten questionnaire scenarios used as testing data.


Ioannides & Vosniadou and diSessa et al. both used a sketch-based questionnaire to
characterize each student’s concept of force. Ioannides & Vosniadou’s questionnaire varied
slightly from diSessa et al.’s version, so we used the more recent and succinct (diSessa et al.)
variation. The questionnaire contains ten scenarios, five of which are illustrated in Figure 47
(right). Each scenario contains two sketches (A and B) of a person and a rock, and the student is
asked three questions:

1. What forces act on rock A?

2. What forces act on rock B?

3. Is the force on rock A the same as or different from the one on rock B?

One  or  more  aspects  vary  between  the  A  and  B  sketch  within  a  scenario  (e.g.,  the  size  of  the 
rock,  the  size  of  the  person,  and  the  motion  of  the  rock).  This  helps  identify  which  variables 
determine  the  existence  and  magnitude  of  force,  which  ultimately  determines  the  student’s  ideas 
of  force. 

8.1.1  Replicating  the  force  questionnaire  and  approximating  students’  observations 

We  sketched  the  questionnaire  from  diSessa  et  al.  using  CogSketch  (illustrated  in  Figure  47, 
right).  We  use  sketched  annotations,  as  described  in  Chapter  5,  to  indicate  pushing  (blue  arrows) 
and  movement  (green  arrows)  as  indicated  in  the  original  questionnaire.  We  use  the  same  coding 
strategy  as  diSessa  et  al.  and  Ioannides  &  Vosniadou  to  classify  our  simulation’s  meaning  of 
force.  To  simplify  coding,  we  interpret  diSessa  et  al.’s  question  (3)  as: 

3.  Which  rock  has  greater  force(s)  acting  on  it,  if  they  are  comparable? 

This allows us to query for an ordinal relationship (e.g., greaterThan, lessThan, or
equalTo) between the quantities of force on the rocks.

We also use CogSketch to sketch comic graphs (see Figure 47, left), which are used as
training data. These comic graphs are similar to those in Chapter 5, except they contain no
annotations. Consequently, the system has to detect motion and infer force-like quantities
independently. As mentioned above, the entire sketched questionnaire is interleaved after each
comic graph training datum to determine which, if any, student meaning of force in Figure 46 is
used by the system. Each simulation trial thus generates a sequence of force categories. We can
plot this sequence of force categories against the student data in Figure 46 to determine whether
the system’s trajectory of learning follows a pattern within the results of Ioannides & Vosniadou
(2002).

We  next  discuss  how  comic  graphs  are  processed  and  explained  by  the  simulation,  and  how 
heuristics  are  used  to  revise  knowledge  upon  failure. 

8.2  Learning  by  explaining  new  observations 

When  the  simulation  is  given  a  new  comic  graph  as  a  training  datum,  it  detects  all  quantity 
changes  in  the  comic  graph,  such  as  movements  along  the  x-axis.  These  quantity  changes  are 
explanandums,  so  the  simulation  must  explain  why  each  quantity  change  starts,  persists,  and 
stops.  If  no  explanation  can  be  constructed  that  is  consistent  within  the  scenario,  then  the  system 
revises  its  domain  knowledge  until  all  quantity  changes  can  be  explained.41  We  discuss  these 
operations  in  the  order  in  which  they  occur,  using  the  comic  graph  shown  in  Figure  47(left)  to 
illustrate. 


41 When people encounter anomalies, they can ignore them altogether (Feltovich et al., 2001), hold them in
abeyance, or exclude them from their domain theory (Chinn & Brewer, 1998). Our simulation’s only response to
anomaly is revision, so we expect rapid transition between concepts of force. We address this in the discussion
section of this chapter.

Heuristic createDecreaseProcess
Participants:
  ?obj Entity
  ?q Quantity
Constraints:
  (decreasing (?q ?obj))
Consequences:
  (isa ?process ModelFragment)
  (revise ?process (addParticipant ?e Entity))
  (revise ?process (addConsequence (> (Rate ?self) 0)))
  (revise ?process (addConsequence (i- (?q ?e) (Rate ?self))))

ModelFragment rrq
Participants:
  ?e Entity
Constraints:
  nil
Conditions:
  nil
Consequences:
  (> (Rate ?self) 0)
  (i- (x-pos ?e) (Rate ?self))

Figure 48: Left: a heuristic createDecreaseProcess that automatically creates a new
model fragment to explain a quantity decreasing. Right: Process model of leftward
movement automatically created with this heuristic.

The  simulation  first  finds  quantity  changes  by  comparing  adjacent  subsketches  (e.g., 
subsketches  3  and  4  in  Figure  47,  left)  using  the  spatial  quantities  encoded  by  CogSketch.  If  a 
quantity  varies  over  a  constant  threshold  (to  account  for  unintentional  jitter  while  sketching),  a 
quantity  change  is  encoded  over  that  quantity  for  the  transition.  For  example,  in  the  2->3  and 
3->4  transitions,  the  x-coordinate  of  the  ball  decreases.  Once  the  system  computes  all  quantity 
changes  within  a  comic  graph,  it  must  explain  why  each  quantity  change  begins  and  ends. 
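The change-detection step described above can be sketched as follows. This is a minimal illustration, not the CogSketch encoding itself: the frame representation (a dictionary of numeric values per subsketch), the name `detect_quantity_changes`, and the threshold value are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical jitter threshold in sketch coordinate units (not given in the text).
JITTER_THRESHOLD = 5.0


@dataclass(frozen=True)
class QuantityChange:
    entity: str
    quantity: str
    transition: tuple  # (from_frame, to_frame), numbered from 1
    direction: str     # "increasing" or "decreasing"


def detect_quantity_changes(frames, threshold=JITTER_THRESHOLD):
    """Compare adjacent frames and report changes larger than the threshold.

    `frames` is a list of dicts mapping (entity, quantity) -> numeric value.
    """
    changes = []
    for i in range(len(frames) - 1):
        before, after = frames[i], frames[i + 1]
        # Only quantities present in both frames can be compared.
        for key in sorted(before.keys() & after.keys()):
            delta = after[key] - before[key]
            if abs(delta) > threshold:
                direction = "increasing" if delta > 0 else "decreasing"
                changes.append(
                    QuantityChange(key[0], key[1], (i + 1, i + 2), direction)
                )
    return changes


# Four subsketches: the 1->2 difference is below the jitter threshold.
frames = [
    {("ball", "x-pos"): 200.0},
    {("ball", "x-pos"): 199.0},
    {("ball", "x-pos"): 150.0},
    {("ball", "x-pos"): 100.0},
]
changes = detect_quantity_changes(frames)
```

With these made-up coordinates, the sketch reports a decreasing x-position for the 2->3 and 3->4 transitions only, mirroring the example in the text.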

For  our  discussion,  suppose  the  simulation  is  explaining  the  ball’s  movement  as  seen  in  the 
transitions  2->3->4.  Suppose  also  that  this  is  the  first  comic  graph  that  the  system  has 
encountered.  Since  the  simulation  begins  with  no  model  fragments  and  no  explanations,  it  will 
fail  to  explain  the  ball’s  movement.  Heuristics  are  used  to  revise  and  extend  domain  knowledge 
in  order  to  accommodate  this  observation. 


8.2.1  Declarative  heuristics  for  failure-based  revision 


Like model fragments, heuristics are declarative. This means that the system can inspect them in order to decide which to use. To illustrate why this is important, suppose that the system is unable to explain an object’s motion, and two heuristics apply to the situation: (1) revise an existing model fragment by adding a statement to its conditions; or (2) hypothesize a new, unobservable category that causes objects to resist motion, and then revise a model fragment to account for this. Which heuristic should the system choose? The psychology literature suggests that students make minimal changes to their theories when confronted with anomalous data (Chinn and Brewer, 1993), so our system makes the minimal change possible. It inspects heuristics to rate the amount of change they will incur, and sorts them accordingly. Heuristics are defined using a vocabulary similar to that of model fragments. Figure 48 (left) shows one such heuristic used by the system, which we will describe within our example.

Continuing  our  example  in  the  previous  section,  suppose  the  simulation  is  given  the  comic 
graph  of  the  foot  kicking  the  ball  to  the  left  in  Figure  47(left),  and  must  explain  the  ball  moving. 
Since  the  system  begins  without  any  model  fragments  or  explanations,  it  fails  to  explain  the 
ball’s  movement.  It  finds  applicable  heuristics  by  testing  the  participants  and  constraints  of  the 
heuristics.  The  heuristic  createDecreaseProcess  in  Figure  48  (left)  applies  to  this  situation, 
since  a  quantity  ?q  of  an  entity  ?obj  is  decreasing  (i.e.,  a  ball’s  x-axis  position  is  decreasing). 
The  consequences  of  this  heuristic  (1)  add  a  new,  empty  model  fragment  ?process  to  the 
domain  knowledge  microtheory  and  (2)  revise  ?process  so  that  it  describes  the  corresponding 
quantity ?q of an entity ?e decreasing. This produces the process model m1 (Figure 48, right) which describes an object moving to the left. The ball’s leftward movement can now be explained using m1.

Figure 50: (a) Model fragment m1 (Figure 48, right) explains the ball moving, but not the ball stopping. (b) After revising m1 as m2 (Figure 49, right), m2 explains both phenomena, and preferences are computed.

The system now has a rudimentary model fragment m1 which describes objects - actually, all objects - moving continually to the left. The resulting network is shown in Figure 50(a).

Next, the system must explain why the ball stops moving. Provided only model fragment m1, this is not possible. The system must revise its knowledge again to resolve this next failure.

Heuristic addHiddenQtyCond
Participants:
  ?s CurrentState
  ?p ProcessInstance
  ?t ProcessType
Constraints:
  (startsAfterEndingOf ?s ?p)
  (isa ?p ?t)
Consequences:
  (exists ?cq)
  (isa ?cq ConceptualQuantity)
  (revise ?t (addQtyCondition ?cq))

ModelFragment m2
Participants:
  ?e Entity
Constraints:
  nil
Conditions:
  (> (q ?e) 0)
Consequences:
  (> (Rate ?self) 0)
  (i- (x-pos ?e) (Rate ?self))

Figure 49: Left: a heuristic addHiddenQtyCond that revises process models by adding a hidden (conceptual) quantity. Right: m2, the result of revising m1 (Figure 48, right) with addHiddenQtyCond. Hidden quantity q, a placeholder force-like quantity, is revisable by other heuristics.

This revision is illustrated in Figure 49, where the heuristic addHiddenQtyCond is used to revise m1 as m2. The participants and constraints of heuristic addHiddenQtyCond assert that it is applicable when some process model (e.g., m1) ends before the current state. The consequences of the heuristic (1) assert the existence of a new, hidden quantity ?cq and (2) revise the conditions of the model fragment (m1) to require the existence of ?cq. Suppose that the system generated the ground symbol q to represent the conceptual quantity ?cq. The result is model fragment m2, which describes things moving when they have q at a rate qualitatively proportional to their q. Hidden conceptual quantities, such as q, are categories that are not observable in a scenario, and their existence is inferred via the conditions and consequences of model fragments. The network after applying the heuristic addHiddenQtyCond and explaining the ball stopping is shown in Figure 50(b). This includes the new quantity q and a preference m1 <c m2. Note that the previous model m1 still exists in the system - instead of directly revising the model fragment m1 into m2, the system copies m1 before performing the revision. This copy-revise-prefer approach means that the structure of any previous explanations that use m1 would remain intact. The preference over model fragments m1 <c m2 causes the derivation of explanation-level preference x0 <xp x1, as described in section 4.6.1. The preference m1 <c m2 also indicates opportunities for retrospective explanation, as discussed in section 4.7. We discuss the role of retrospective explanation later in this chapter.
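The copy-revise-prefer behavior described above can be sketched as a small data structure. This is only an illustration under assumed names: `ModelFragment`, `DomainKnowledge`, and `copy_revise_prefer` are hypothetical, and statements are stored as plain strings rather than as the system's actual representations.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelFragment:
    name: str
    participants: tuple
    conditions: tuple
    consequences: tuple


class DomainKnowledge:
    """Hypothetical store of model fragments and concept-level preferences."""

    def __init__(self):
        self.fragments = {}
        self.preferences = []  # (less_preferred, more_preferred) pairs

    def add(self, fragment):
        self.fragments[fragment.name] = fragment

    def copy_revise_prefer(self, old_name, new_name, **changes):
        old = self.fragments[old_name]
        # Copy the old fragment with the requested fields changed;
        # the original object is frozen and stays in the store untouched.
        revised = replace(old, name=new_name, **changes)
        self.add(revised)
        # Record the preference old <c new.
        self.preferences.append((old_name, new_name))
        return revised


dk = DomainKnowledge()
m1 = ModelFragment(
    name="m1",
    participants=(("?e", "Entity"),),
    conditions=(),
    consequences=("(> (Rate ?self) 0)", "(i- (x-pos ?e) (Rate ?self))"),
)
dk.add(m1)
m2 = dk.copy_revise_prefer(
    "m1", "m2", conditions=m1.conditions + ("(> (q ?e) 0)",)
)
```

Because m1 is never mutated, any explanation that refers to it keeps its structure, while the recorded pair plays the role of the preference m1 <c m2.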

As noted by Kass (1994), adaptation mechanisms - such as these revision heuristics - fall on a spectrum from (1) a multitude of domain-dependent adaptation strategies to (2) a smaller number of very general, domain-independent strategies. In this simulation, no heuristic explicitly mentions movement or x/y coordinate quantities, so the heuristics are not purely domain-dependent; however, in the case of heuristic addHiddenQtyCond (Figure 49) and others like it, heuristics can be very specialized in their applicability. We next discuss how the system chooses between heuristics when several are applicable.

8.2.2 Choosing among applicable heuristics

A heuristic’s applicability to a situation is determined by its participants and constraints, and its complexity of change is determined by its consequences. Some consequences create new model fragments and categories altogether. For instance, createDecreaseProcess created model fragment m1, and addHiddenQtyCond created a new conceptual quantity q. These consequences extend the domain knowledge of the agent. Other consequences revise existing model fragments (e.g., addHiddenQtyCond revises a model fragment to extend its conditions and consequences) and categories. Heuristics are ordered from minimum to maximum estimated change by tallying their consequences. The cost of each consequence is as follows:

•  Revising  a  conceptual  quantity’s  specification:  3 

•  Revising  (i.e.,  copying  and  revising)  a  model  fragment:  7 

•  Creating  an  altogether  new  model  fragment:  20 

•  Creating  an  altogether  new  conceptual  quantity:  20 

Using  this  cost  metric,  the  system  can  assign  a  numerical  cost  to  each  applicable  heuristic  by 
summing  the  cost  of  its  consequences.  The  system  then  sorts  heuristics  by  ascending  cost  and 
executes  them  in  that  order  until  it  can  explain  the  situation. 
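The cost-based ordering can be sketched as below. The four cost values are the ones listed above; everything else is an assumption for illustration, including the consequence labels, the heuristic names `addCondition` and `addHiddenQuantity`, and the choice to count each listed consequence kind once (the text does not spell out how repeated revisions of one fragment are tallied).

```python
# Costs taken from the list in the text.
COSTS = {
    "revise_quantity": 3,
    "revise_fragment": 7,
    "create_fragment": 20,
    "create_quantity": 20,
}


def heuristic_cost(consequence_kinds):
    """Sum the cost of each consequence kind a heuristic would incur."""
    return sum(COSTS[kind] for kind in consequence_kinds)


def revise_until_explained(applicable, can_explain):
    """Apply heuristics in ascending cost order until the situation is explained.

    `applicable` is a list of (name, consequence_kinds, apply_fn) tuples.
    Returns the names of the heuristics that were executed, in order.
    """
    ordered = sorted(applicable, key=lambda h: heuristic_cost(h[1]))
    executed = []
    for name, _kinds, apply_fn in ordered:
        apply_fn()
        executed.append(name)
        if can_explain():
            break
    return executed


# Hypothetical stand-ins for the two heuristics discussed in section 8.2.1.
state = {"explained": False}


def apply_any():
    state["explained"] = True


applicable = [
    ("addHiddenQuantity", ["create_quantity", "revise_fragment"], apply_any),
    ("addCondition", ["revise_fragment"], apply_any),
]
executed = revise_until_explained(applicable, lambda: state["explained"])
```

Under these assumptions the cheaper revision (cost 7) is tried before the one that hypothesizes a new quantity (cost 27), and the loop stops as soon as the explanation succeeds.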

8.2.3  Revising  conceptual  quantities 

Like model fragments, conceptual quantities such as q can be revised using heuristics when the system fails to explain an explanandum. When the quantity q is created by the heuristic addHiddenQtyCond, it has a magnitude that permits leftward movement. While accommodating subsequent training data, heuristics can revise q to: (1) add a vector component so that q has a spatial direction as well as a magnitude; (2) add an influence from another quantity, such that an object’s size influences its amount of q; (3) add direct influences from process rates, e.g., to describe the transfer of q between objects or consumption of q; or (4) change q to a quantity that only exists between - and not within - objects. When a quantity q is revised as q', the former specification q remains, so that existing explanations that use q are not affected. As with revised model fragments, a preference q <c q' is automatically encoded in the network.

Heuristic vectorizeQty
Participants:
  ?obj Entity
  ?quant SpatialQuantity
  ?c-quant ConceptualQuantity
  ?t ProcessType
Constraints:
  (increasing (?quant ?obj))
  (consequence ?t (i- (?quant ?ent) (Rate ?self)))
  (condition ?t (> (?c-quant ?ent) 0))
Consequences:
  (isa ?c-quant VectorQuantity)
  (revise ?t (addParticipant ?d Direction))
  (revise ?t (directionalizeQuantity ?quant ?d))
  (revise ?t (directionalizeQuantity ?c-quant ?d))

ModelFragment m3
Participants:
  ?e Entity
  ?d Direction
Constraints:
  nil
Conditions:
  (> (q[?d] ?e) 0)
Consequences:
  (> (Rate ?self) 0)
  (i+ (pos[?d] ?e) (Rate ?self))

Figure 51: Left: a heuristic vectorizeQty that transforms a scalar conceptual quantity into a vector quantity and revises the corresponding model fragment to take a direction. Right: m3, the result of revising m2 (Figure 49, right) with vectorizeQty.

To illustrate another failure-based revision, consider an example where the system must explain a cup sliding to the right along a table, but at present, it only has a model fragment describing leftward movement, m2 (Figure 49, right). Rather than construct a new model fragment altogether, it can use the heuristic vectorizeQty (Figure 51, left) to revise its model fragment m2 into model fragment m3 (Figure 51, right). This heuristic revises both the model fragment and the conceptual quantity q. The quantity q now has a directional component such as left or right, and according to model fragment m3, something moves left or right when it has q in that direction. The symbol zero is used to represent the directional component when a quantity is not changing.

8.2.3.1 Ontological properties of conceptual quantities

We  have  described  the  mechanism  by  which  conceptual  quantities  are  revised,  but  there  are 
ontological  questions  regarding  the  initial  conceptual  quantity  q.  For  instance,  does  q  have  a 
spatial  extent?  How  does  it  combine  with  the  q  of  other  objects?  How  is  it  acquired  or 
consumed?  How  does  it  change  its  directional  component?  We  look  to  the  cognitive 
psychology  literature  for  insight. 

Pfundt and Duit (1991) analyzed approximately 2,000 published articles about novice misconceptions in the domain of force dynamics. These illustrate that novices do not generally conceive of force as an interaction between two material objects. The most common misconception is that force is a property of a single object. Chi and colleagues (Chi, 2008; Reiner et al., 2000; Chi et al., 1994b) argue that novices often attribute to this internal property of force the ontological properties of a substance schema. The substance schema listed in Reiner et al. (2000) contains eleven ontological attributes, though these are not claimed to be complete or globally coherent:

1. Substances are pushable (able to push and be pushed).

2. Substances are frictional (experience “drag” when moving in contact with a surface).

3. Substances are containable (able to be contained by something).

4. Substances are consumable (able to be “used up”).

5. Substances are locational (have a definite location).

6. Substances are transitional (able to move or be moved).

7. Substances are stable (do not spontaneously appear or disappear).

8. Substances can be of a corpuscular nature (have surface and volume).

9. Substances are additive (can be combined to increase mass and volume).

10. Substances are inertial (require a force to accelerate).

11. Substances are gravity sensitive (fall downward when dropped).

Not  all  of  these  attributes  are  relevant  for  our  system’s  conceptual  quantity  q  that  mediates 
motion,  but  we  use  these  guidelines  for  constraining  the  properties  of  conceptual  quantities  when 
there  is  any  question.  For  instance,  according  to  model  fragment  m2,  q  is  a  property  of  an  entity 
and  not  an  abstract  property,  so  it  is  locational  and  (in  some  sense)  containable.  Additionally, 
since  conceptual  quantities  must  be  stable,  the  system  must  justify  how  an  object’s  q  increases 
and decreases. This is achieved using: (1) processes that describe consumption of quantities over time so that q is consumable (see also the “dying away” p-prim in diSessa, 1993) and (2) processes that describe the transfer of quantities between objects so that q is transitional.

If we apply these principles to the directional conceptual quantity q described within model fragment m3, an entity has q[left] when it travels leftward, q[right] when it travels rightward, and q[zero] when it is still. This means that other processes affect the direction of an object’s q, which in turn affects the object’s position in space. Without a transfer across objects, the sum of an object’s q across directions is constant. This satisfies the stability constraint of the substance schema.

In the psychology literature, Ioannides and Vosniadou (2002) investigate the “meaning” of force. In our model, the meaning of a quantity (e.g., q) is a conjunction of these ontological constraints on the quantity, the direct and indirect influences, and model fragments (e.g., m2 in Figure 49) that describe the existence and behavior of the quantity within a scenario. As quantities and model fragments change, so will the presence and role of q within the questionnaire scenarios that we use as testing data.

Thus  far,  we  have  described  how  the  system  explains  quantity  changes  within  observations 
and  revises  model  fragments  and  quantities.  However,  the  system  also  explains  differences  in 
behavior  between  similar  observations,  using  analogy.  This  comparative  explanation  process  is 
important  for  finding  qualitative  proportionalities  between  quantities.  We  discuss  this  next. 

8.2.4  Inter-scenario  analysis 

After the system explains the quantity changes within a comic graph observation, it retrieves a similar previous observation to determine whether there are any discrepancies. If there are variations in the quantity changes between observations (e.g., one object moves further than another object) then they must be explained. Failure to explain these discrepancies results in the use of heuristics to revise domain knowledge, as described above. We call this inter-scenario analysis, and we illustrate this with an example.

Suppose that the comic graph labeled “Scenario A” in Figure 52 has already been explained by the simulation. Suppose also that the simulation has just explained the quantity changes within a second comic graph labeled “Scenario B” in Figure 52. Scenario B is identical to Scenario A, except that a smaller ball is kicked a greater distance. Finally, suppose that both scenarios were explained using the same specification of q and model fragment m3 (Figure 51, right), as well as process model fragment instances that describe the q[zero] of an object transitioning to q[left/right] of the object, and vice versa, obeying the stability constraint of the substance schema.
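The stability constraint on these transitions can be illustrated with a short sketch. The model itself is qualitative, so the numeric magnitudes here are an assumption introduced only to make the bookkeeping concrete, and the function name `shift_q` is hypothetical.

```python
def shift_q(q_by_direction, source, target, amount):
    """Move `amount` of q from one directional component to another.

    Returns a new dict; the total across directions is unchanged, which is
    the sense in which the transition obeys the stability constraint.
    """
    if amount < 0 or amount > q_by_direction[source]:
        raise ValueError("cannot shift more q than the source component holds")
    updated = dict(q_by_direction)
    updated[source] -= amount
    updated[target] = updated.get(target, 0.0) + amount
    return updated


# Illustrative magnitudes only.
ball_q = {"zero": 10.0, "left": 0.0, "right": 0.0}
kicked = shift_q(ball_q, "zero", "left", 6.0)    # q[zero] -> q[left]
at_rest = shift_q(kicked, "left", "zero", 6.0)   # q[left] -> q[zero]
```

In this sketch, no q appears or disappears without a transfer between objects: each transition only redistributes the object's q among its directional components.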


Figure 52: Comic graph scenarios A and B are sufficiently similar for inter-scenario analysis.

After the system explains Scenario B, it uses MAC/FAC to retrieve a similar, previously-explained scenario. If the SME normalized similarity score between the probe and the previous scenario is above a threshold value (we use 0.95 in our simulation),42 then inter-scenario analysis proceeds between the two scenarios. Suppose that Scenario A is retrieved, and that the SME mapping between Scenarios A and B exceeds the similarity threshold.

42 See section 3.4.1 for a description of how this is computed.

Inter-scenario analysis between Scenarios A and B involves explaining why corresponding quantities changed differently in Scenario A than they did in Scenario B, if applicable. For instance, the ball in Scenario B travels a greater distance along the x-axis than the ball in Scenario A. These quantity change variations are detected by analyzing correspondences in the SME mapping between Scenarios A and B, several of which are shown in Figure 53. From these correspondences, the system can compute two inequalities, shown in the right column of Figure 53:


(Area ball-a) > (Area ball-b)
(Δx[left] ball-a) < (Δx[left] ball-b)

The inequality between quantity changes (Δx[left] ball-a) < (Δx[left] ball-b) must be explained. As above, heuristics are used to revise knowledge to aid in explanation.
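Deriving ordinal relations from corresponding quantities can be sketched as follows. The correspondences are supplied directly as pairs rather than computed by SME, the numeric values are invented for illustration, and `dx[left]` is an ASCII stand-in for the change in leftward x-position; the function names are hypothetical.

```python
def ordinal(a, b):
    """Return '>', '<', or '=' for known values, '?' if either is unknown."""
    if a is None or b is None:
        return "?"
    if a > b:
        return ">"
    if a < b:
        return "<"
    return "="


def compare_corresponding(correspondences, values_a, values_b):
    """Compare each pair of corresponding quantity expressions.

    `correspondences` is a list of (expr_a, expr_b) pairs, assumed to come
    from an analogical mapping between the two scenarios.
    """
    rows = []
    for expr_a, expr_b in correspondences:
        relation = ordinal(values_a.get(expr_a), values_b.get(expr_b))
        rows.append((expr_a, relation, expr_b))
    return rows


correspondences = [
    ("(Area ball-a)", "(Area ball-b)"),
    ("(dx[left] ball-a)", "(dx[left] ball-b)"),
    ("(q[left] ball-a)", "(q[left] ball-b)"),
]
# Observable quantities have values; the hidden quantity q does not.
values_a = {"(Area ball-a)": 40.0, "(dx[left] ball-a)": 60.0}
values_b = {"(Area ball-b)": 25.0, "(dx[left] ball-b)": 110.0}
rows = compare_corresponding(correspondences, values_a, values_b)
```

The observable quantities yield the two inequalities, while the relation between the hidden q values stays undetermined until it is inferred in the explanation that follows.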


Scenario A formula                        Scenario B formula                        Inequality (if applicable)
foot-a                                    foot-b                                    n/a
ground-a                                  ground-b                                  n/a
ball-a                                    ball-b                                    n/a
mfi-a                                     mfi-b                                     n/a
(isa mfi-a m3)                            (isa mfi-b m3)                            n/a
(Area ball-a)                             (Area ball-b)                             (Area ball-a) > (Area ball-b)
(Δx[left] ball-a)                         (Δx[left] ball-b)                         (Δx[left] ball-a) < (Δx[left] ball-b)
(q[left] ball-a)                          (q[left] ball-b)                          (q[left] ball-a) ? (q[left] ball-b)
(Rate mfi-a)                              (Rate mfi-b)                              (Rate mfi-a) ? (Rate mfi-b)
(i+ (x-pos[left] ball-a) (Rate mfi-a))    (i+ (x-pos[left] ball-b) (Rate mfi-b))    n/a
(> (q[left] ball-a) 0)                    (> (q[left] ball-b) 0)                    n/a

Figure 53: Selected analogical correspondences between Scenarios A and B (Figure 52).

The first task in explaining the quantity change inequality is to derive other inequalities between corresponding quantities. As mentioned above, the movements of ball-a and ball-b were explained using model m3 in Figure 51 (right). Since the m3 model fragment instances mfi-a and mfi-b of Scenarios A and B correspond (see Figure 53), so do their respective rates, (Rate mfi-a) and (Rate mfi-b), and their respective conditions (> (q[left] ball-a) 0) and (> (q[left] ball-b) 0). The corresponding conditions are especially important, since (1) conditions must hold for the processes mfi-a and mfi-b to be active and (2) assuming a closed world, these processes are the only influences of Δx[left] for both balls. If we assume the rates of the processes are the same,43 the variation in Δx[left] between ball-a and ball-b is attributable to (q[left] ball-a) and (q[left] ball-b), which varied the duration of these process instances. This produces the following ordinal relation to describe the relative q values in the transition to the last frame of the comic graphs:

(q[left]  ball-a)  <  (q[left]  ball-b) 

43 The system’s explanation of this quantity variation relies on the assumptions the system makes about time. For instance, if we assume the transitions between corresponding frames in Scenarios A and B take equal time, then the variation in leftward movement can only be explained by varying rates of change. If we do not make this assumption, we can explain variation of leftward movement with equal rates of change and one process being active longer than the other, corresponding process.

The variation in Δx[left] has been explained with an inequality in q[left] values in this state, but now the inequality between q[left] values in this state requires an explanation. This will require that the system revise its beliefs about q. Since q is a conceptual quantity created by the system, there are many ways to explain this inequality. We use the substance schema of Reiner et al. (2000) to constrain the system’s explanation. The following inferences are plausible with respect to the substance schema:

1. ball-b has more total q than ball-a in the movement states, but this is consumed before the resting state is reached.

2. ball-b has more q[left] than ball-a in the movement states, but this transitions to q[zero] before the resting state.

Inference  (1)  is  not  plausible  in  our  example,  since  the  simulation  does  not  have  a  model  of  how 
a  greater  total  amount  of  the  conceptual  quantity  q  is  initially  acquired  by  the  ball.  In  simulation 
trials  where  the  system  constructs  a  model  of  q  transfer  prior  to  this  analysis,  this  is  the  path 
chosen  by  the  system.  Inference  (2)  obeys  the  stability  constraint  of  the  substance  schema  as 
well  as  the  present  properties  of  the  conceptual  quantity.  As  a  result,  the  system  must  explain 
why  ball-b  has  greater  q  [left]  and  less  q  [zero]  in  the  movement  state.  This  is  done  by 
asserting  a  new  qualitative  proportionality  to  another  varying  quantity.  In  this  case,  the 
inequality  (Area  ball-a)  >  (Area  ball-b)  is  used,  since  it  is  the  only  other  varying 
quantity.  The  system  asserts  the  following  statements  for  entities  ?ent  and  directions  ?dir: 

if ?dir ≠ zero:
  (qprop- (q[?dir] ?ent) (Area ?ent))

(qprop (q[zero] ?ent) (Area ?ent))

This states that all else being equal, smaller objects have more directional (e.g., left or right) q, which propels them further than larger objects. Larger objects have greater q in the zero direction. If a second quantity, such as the size of the foot, varied in addition to the size of the balls, neither would be isolated. Consequently, either or both might explain the variation in Δx[left], and inter-scenario analysis terminates without revising the quantity. This makes the simulation more conservative when hypothesizing qualitative proportionalities, since it requires pairwise quantity variations in isolation.
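The isolation requirement can be sketched as a small decision function. This is an illustration under assumed names (`hypothesize_qprop` and the quantity labels are hypothetical): it proposes a qualitative proportionality only when exactly one other quantity varies between the two scenarios, and picks the sign by comparing the directions of the two inequalities.

```python
def hypothesize_qprop(target_relation, other_relations):
    """Propose a qualitative proportionality for an unexplained inequality.

    `target_relation` is '<' or '>' for the quantity to explain (Scenario A
    relative to Scenario B). `other_relations` maps each other corresponding
    quantity to '<', '>', or '='. Returns (sign, quantity) or None.
    """
    if target_relation not in ("<", ">"):
        return None
    varying = {q: r for q, r in other_relations.items() if r in ("<", ">")}
    if len(varying) != 1:
        # Nothing varies, or several quantities vary together: none is
        # isolated, so no proportionality is hypothesized.
        return None
    quantity, relation = next(iter(varying.items()))
    # Same direction of variation -> positive; opposite -> negative.
    sign = "qprop" if relation == target_relation else "qprop-"
    return (sign, quantity)


# q[left] is smaller for ball-a while Area is larger for ball-a.
isolated = hypothesize_qprop("<", {"Area": ">"})
# If the foot size also varied, no hypothesis would be made.
confounded = hypothesize_qprop("<", {"Area": ">", "FootArea": "<"})
```

For the example in the text, the isolated case yields a negative proportionality between directional q and area, and the confounded case yields no revision.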

After  this  conceptual  quantity  is  revised  by  adding  the  above  qualitative  proportionality, 
inter-scenario  analysis  is  complete.  When  the  simulation  uses  the  revised  quantity  q  and  the 
associated  model  fragments  to  answer  the  questionnaire,  it  will  assert  that  entities  have  exactly 
one  q  property,  and  its  magnitude  is  a  function  of  its  size.  These  answers  are  consistent  with  the 
“Internal”  meaning  of  force  according  to  the  coding  scheme  of  Ioannides  and  Vosniadou  (2002). 

8.2.5  Retrospective  explanation  propagates  revisions 

We  have  described  how  the  simulation  revises  its  knowledge  when  it  fails  to  explain 
observations  or  when  it  fails  to  explain  variations  between  similar  observations.  Instead  of 
revising  a  construct  (i.e.,  model  fragment  or  quantity)  directly,  the  system  copies  it  and  then 
revises  the  copy  so  that  the  prior  construct  remains.  The  agent  then  encodes  an  epistemic 
preference  for  the  new  construct  over  the  prior  one.  Figure  50  illustrates  this  copy-revise-prefer 
behavior.  After  the  revision,  quantity  changes  that  were  explained  with  the  prior  construct  retain 
their  present  explanations,  despite  the  fact  that  these  explanations  rely  on  outdated  domain 
knowledge. 

The process of retrospective explanation, described in section 4.7, constructs new explanations to replace these outdated explanations. Retrospective explanation is achieved through the following steps in this simulation:

1. Find an outdated explanandum (i.e., quantity change) m. An explanandum is outdated if and only if (1) there is a concept-level preference b <c b' and (2) the preferred explanation for m is x, and x uses b and not b'.

2. Attempt to explain the explanandum with preferred knowledge, using the same explanation construction algorithms as above.

3. Compute preferences over new explanations and previous explanations, using the same explanation evaluation algorithms as above.

4. Map the explanandum to a new, preferred explanation, if applicable.

5. If the outdated explanandum m still retains its previously preferred explanation, store the triple (m, b, b') so that this process is not later repeated for the same purpose.
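The five steps above can be sketched as a loop over outdated explananda. This is a simplified illustration: steps 2 and 3 are abstracted into a caller-supplied `reexplain` function (a hypothetical stand-in for explanation construction and evaluation), explanations are reduced to sets of construct names, and the triple is stored whenever the explanandum is still outdated after the attempt.

```python
def is_outdated(m, b, b_new, preferred, uses):
    """Step 1: m's preferred explanation uses b but not the preferred b_new."""
    constructs = uses[preferred[m]]
    return b in constructs and b_new not in constructs


def retrospective_explanation(preferred, uses, concept_prefs, reexplain):
    """Re-explain outdated explananda; return the triples stored in step 5.

    preferred:     dict explanandum -> id of its preferred explanation
    uses:          dict explanation id -> set of constructs it uses
    concept_prefs: list of (b, b_new) pairs meaning b <c b_new
    reexplain:     callable (m, b_new) -> (explanation id, constructs) or None
    """
    attempted = set()
    progress = True
    while progress:
        progress = False
        for m in list(preferred):
            for b, b_new in concept_prefs:
                triple = (m, b, b_new)
                if triple in attempted:
                    continue
                if not is_outdated(m, b, b_new, preferred, uses):
                    continue
                result = reexplain(m, b_new)        # steps 2 and 3
                if result is not None:
                    x_id, constructs = result
                    uses[x_id] = set(constructs)
                    preferred[m] = x_id             # step 4
                if is_outdated(m, b, b_new, preferred, uses):
                    attempted.add(triple)           # step 5
                progress = True
    return attempted


preferred = {"ball-moves": "x0", "cup-moves": "x9"}
uses = {"x0": {"m1"}, "x9": {"m1"}}


def reexplain(m, b_new):
    # Hypothetical: only the ball's movement can be re-explained with m2.
    return ("x1", {b_new}) if m == "ball-moves" else None


stored = retrospective_explanation(preferred, uses, [("m1", "m2")], reexplain)
```

In this run, one explanandum is remapped to an explanation that uses the preferred construct, and the other keeps its old explanation with its triple recorded so that it is not retried.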

Retrospective  explanation  is  an  incremental  transition  from  one  causal  description  to  another. 
This  models  the  students’  incremental  transition  to  a  new  understanding  of  the  world.44 

In  this  simulation,  retrospective  explanation  occurs  to  completion  after  each  new  training 
datum  is  given  to  the  system.  This  means  that  every  local  revision  to  domain  knowledge  is 
immediately  used  to  explain  previous  observations. 

8.3  Simulation  results 

Here we describe the setup and results of our simulation. The psychological assumptions and the justification of our match with student data are addressed in the section after this.


44 Following McDermott (1976), this is not to suggest that the simulation is itself “understanding” the phenomena.

We used ten comic graphs as training data, for a total of 58 comic graph frames and 22 instances of movement. These were sketched in CogSketch. The system was also given starting knowledge about agency, such that people and their respective body-parts cause their own translation. The system uses this knowledge to explain how a person or body-part (1) starts or stops translating or (2) imparts or consumes a conceptual quantity (e.g., q, in the above examples) if that quantity causes movement. Modeling how agency and intentional movement are learned is a nontrivial and interesting research problem, but is beyond the scope of this simulation.


Figure 54: Changes in the simulation’s meaning of force, using Ioannides and Vosniadou’s (2002) student meanings of force.

For each comic graph used as a training datum, the system: (1) explains all quantity changes within the comic graph; (2) retrieves a similar previous comic graph using MAC/FAC, using the present one as a probe; (3) performs inter-scenario analysis if the present and previous comic graphs have an SME normalized similarity score above 0.95; and (4) performs complete retrospective explanation if model fragments or quantities were revised.
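The per-datum procedure can be sketched as a short driver function. Only the 0.95 threshold comes from the text; the function and parameter names are hypothetical, and explanation, retrieval, similarity scoring, and analysis are passed in as stubs rather than implemented with MAC/FAC or SME.

```python
SIMILARITY_THRESHOLD = 0.95  # threshold stated in the text


def process_training_datum(datum, memory, explain, retrieve_similar,
                           similarity, analyze, retro_explain):
    """Run the per-datum steps; every callable is a caller-supplied stub."""
    revised = bool(explain(datum))                       # step 1
    previous = retrieve_similar(datum, memory)           # step 2
    if previous is not None and similarity(datum, previous) > SIMILARITY_THRESHOLD:
        revised = bool(analyze(datum, previous)) or revised  # step 3
    if revised:
        retro_explain()                                  # final step
    memory.append(datum)
    return revised


def run_demo(score):
    """Trace which steps run for a given (made-up) similarity score."""
    log = []
    memory = ["scenario-A"]

    def explain(datum):
        log.append("explain")
        return False  # assume no revision was needed within the datum

    def retrieve_similar(datum, mem):
        log.append("retrieve")
        return mem[-1] if mem else None

    def analyze(datum, previous):
        log.append("analyze")
        return True  # assume the analysis revised a quantity

    def retro_explain():
        log.append("retro")

    process_training_datum("scenario-B", memory, explain, retrieve_similar,
                           lambda a, b: score, analyze, retro_explain)
    return log, memory


high_log, high_memory = run_demo(0.97)
low_log, low_memory = run_demo(0.50)
```

With a score above the threshold the sketch runs all four steps; with a lower score it stops after retrieval, since nothing was revised.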

After a comic graph is processed in this manner, the system completes the entire questionnaire, half of which is shown in Figure 47. From the system’s answers, we determine (1) the conditions under which a force-like quantity exists, and (2) the effect of factors such as size, height, and other agents on the force-like quantity. We use the same coding strategy as Ioannides & Vosniadou (2002) to determine which meaning of force the system has learned, given its answers on the questionnaire. No knowledge revision occurs during the system’s completion of the questionnaire.

Figure 54 illustrates the transitions in the concept of force across 10 independent trials with different comic graph orders. The simulation starts without any process models or quantities to represent force, and transitions to the “Internal Force” concept in 2 of 10 trials, and to a size-indifferent “Internal Force” model, which was not reported by Ioannides & Vosniadou (2002), in 8 of 10 trials. In these cases, the force-like quantity (e.g., q) was not qualitatively proportional to size. The rest of the transitions follow a trajectory similar to the student data in Figure 46. Each trial of the simulation completes an average of six model fragment revisions and four category revisions of a placeholder force-like quantity during its learning.

8.4  Discussion 

We  have  simulated  the  incremental  revision  of  a  force-like  category  over  a  sequence  of 
observations.  As  it  incorporates  new  observations,  the  system  occasionally  fails  to  explain  (1) 
quantity  changes  within  the  observation  and  (2)  why  quantity  changes  vary  between  similar 
observations.  In  response  to  these  anomalous  situations,  the  system  minimally  and  incrementally 
revises  its  domain  knowledge  using  declarative  heuristics.  It  then  propagates  these  local 
revisions  to  other  contexts  via  retrospective  explanation. 

Human  conceptual  change  in  the  domain  of  force  dynamics  occurs  over  a  span  of  years  for 
the  students  in  Ioannides  and  Vosniadou  (2002).  This  can  be  inferred  from  Figure  46,  though  the 
data  at  each  age  group  were  gathered  from  different  students.  Over  these  years,  students  are 
exposed to a multitude of observations, formal instruction, and physical interaction. Providing
this  amount  of  input  and  a  similarly  varied  workload  is  beyond  the  state  of  the  art  in  cognitive 
simulation.  Consequently,  this  simulation  learns  via  sketched  observations  alone,  so  the  set  of 
stimuli  is  smaller  and  much  more  refined.  Since  the  knowledge  encoded  from  CogSketch  is  not 
as  rich  as  human  perception,  our  simulation  relies  upon  the  psychological  assumptions  stated  in 
the  CogSketch  discussion  in  Chapter  3.  Before  revisiting  the  hypotheses  of  this  simulation,  we 
discuss factors that enabled the simulation to transform its domain knowledge so rapidly. These
factors  involve  the  training  data  and  the  computational  model  itself. 

Since  comic  graphs  are  already  segmented  into  qualitative  states,  the  system  does  not  have 
to find the often-fuzzy boundaries between physical behaviors. Furthermore, the sketches
convey relative changes in position, but not relative changes in velocity, so the system need not
differentiate velocity from acceleration, which is difficult for novice students (Dykstra et al.,
1992).  Finally,  the  comic  graphs  are  sparse,  which  simplifies  the  detection  of  anomalies.  Some 
are  also  highly  analogous  (see  Figure  52),  which  facilitates  inter-scenario  analysis. 

Aside  from  the  comic  graph  stimuli,  aspects  of  the  computational  model  itself  accelerate 
learning  beyond  human  performance.  People  have  many  strategies  they  can  use  to  discredit 
anomalous data (Feltovich et al., 2001), and other tactics to avoid conceptual change, such as
explaining  away,  excluding  anomalous  data  from  theories,  reinterpreting  anomalous  data  to  fit 
within  a  theory,  holding  data  in  abeyance,  and  making  partial  or  incomplete  changes  (e.g.,  Chinn 
& Brewer, 1998). In fact, complete conceptual change is a last resort for children
(Chinn  &  Brewer,  1998).  Our  system’s  sole  response  to  any  explanation  failure  is  the  revision  of 
domain  knowledge,  followed  by  exhaustive  retrospective  explanation.  Our  intent  in  this 
simulation is to model a trajectory of minimal - yet successful - conceptual changes. While
modeling these conceptual change avoidance strategies is beyond the scope of the present
simulation,  it  is  an  interesting  opportunity  for  future  work,  and  we  revisit  this  idea  in  Chapter  9. 

A  final  possible  cause  for  the  simulation’s  accelerated  learning  is  the  heuristics  used  by  the 
simulation.  The  mechanisms  by  which  students  spontaneously  revise  their  domain  knowledge 
are  unknown.  As  discussed  in  Chapter  2,  there  is  considerable  debate  regarding  how  such 
knowledge  is  even  represented  and  organized.  The  heuristics  used  in  this  simulation  may  skip 
intermediate steps, and thereby make larger changes than people spontaneously make to their
mental models and categories. Alternatively, the heuristics used in this simulation may revise
domain knowledge in altogether different fashions than children do upon explanation failure.
For example, conceiving of force as an interaction between objects (e.g., “Push/Pull” and
“Acquired & Push/Pull” meanings) may be the result of social interaction and reading (e.g., the
familiar sentence “A force is a push or a pull”) and not of error-based revision.

The  three  trajectories  (i.e.,  unique  paths  through  the  graph)  illustrated  in  Figure  54  describe 
plausible  paths  through  the  human  data  in  Figure  46,  supporting  the  hypothesis  that  our 
explanation-based  framework  can  simulate  human  category  revision.  The  most  popular  -  but 
still  incorrect  -  category  of  force  “Gravity  &  Other”  is  not  reached  by  the  simulation.  This 
category  requires  the  mention  of  gravity,  which  is  not  learned  by  the  simulation,  and  is  almost 
certainly  learned  by  students  through  formal  instruction  and  social  interaction. 

This  simulation  supports  the  hypothesis  that  compositional  model  fragments  can  simulate 
the  mental  models  of  the  students  in  this  domain.  Compositional  models  are  used  here  to  infer 
the  presence  of  unobservable,  force-like  quantities  within  scenarios.  The  system  infers  the 
presence  and  relative  magnitudes  of  these  quantities  in  a  fashion  comparable  with  students,  and 
is able to simulate multiple student misconceptions on the same questionnaire. This supports the
knowledge  representation  hypothesis. 

Chapter 9: Conclusion

“Nothing  endures  but  change” 

-  Heraclitus 

We  have  described  a  computational  model  of  conceptual  change  and  used  it  to  simulate  results 
from  the  literature  on  conceptual  change  in  students  in  different  domains.  Chapter  1  presented 
the  claims  of  the  dissertation,  an  outline  of  our  model,  and  its  psychological  assumptions. 
Chapter  2  discussed  four  existing  theories  of  conceptual  change  and  areas  of  disagreement 
between  them  to  identify  where  our  model  could  shed  some  light.  Chapter  3  reviewed  the  AI 
techniques  used  in  the  computational  model,  and  Chapter  4  presented  the  computational  model 
itself.  The  computational  model  was  used  to  perform  four  simulations,  described  in  Chapters  5- 
8,  providing  empirical  evidence  to  support  the  claims  of  this  dissertation. 

This  chapter  revisits  our  claims  in  light  of  the  evidence  provided  by  the  simulations.  We 
then  discuss  related  AI  systems  and  compare  our  computational  model  to  the  other  theories  of 
conceptual  change  described  in  Chapter  2.  We  close  with  a  discussion  of  general  limitations  and 
opportunities  for  future  work. 

9.1 Revisiting the claims

Here  we  discuss  each  claim  of  this  dissertation.  The  first  claim  is  about  knowledge 
representation: 

Claim 1: Compositional qualitative models provide a psychologically plausible
computational account of human mental models.

Claim 1 is not a new idea, since simulating human mental models was an early motivation
for qualitative modeling in AI (Forbus & Gentner, 1997). However, the simulations described
in Chapters 5-8 offer novel evidence in support of this claim. Since this is a knowledge
representation claim, we supported it by (1) observing how people construct explanations and
solve problems with their mental models from the cognitive science literature and (2) using
compositional qualitative models to construct the same explanations and solve the same
problems. We used qualitative models in all four simulations to simulate student
problem-solving in three domains:

1.  Force  dynamics  (Chapters  5  and  8) 

2.  Astronomy  (Chapter  6) 

3.  Biology  (Chapter  7) 

In  addition,  our  system  used  the  qualitative  models  that  it  learned  to  perform  different 
problem-solving  tasks,  with  results  similar  to  students: 

1. Explaining causal models of a dynamic system (Chapters 6 and 7)

2.  Predicting  the  next  state  of  a  scenario  (Chapter  5) 

3.  Explaining  abstract  events  in  sketched  scenarios  (Chapter  5) 

4.  Explaining  hidden  mechanisms  in  sketched  scenarios  (Chapter  8) 


The  second  claim  involves  learning  by  induction: 

Claim 2: Analogical generalization, as modeled by SAGE, is capable of inducing qualitative
models that satisfy Claim 1.

Claim  2  is  a  novel  claim,  since  AI  systems  have  not  previously  induced  qualitative  models 
using  SAGE.  Chapter  5  supports  this  claim  with  empirical  evidence  by  using  sketched 
observations  as  training  data,  inducing  qualitative  models  from  these  training  data,  and  then 
using  the  resulting  qualitative  models  to  perform  two  problem-solving  tasks  in  a  fashion 
consistent  with  human  students. 

As  we  describe  in  Chapter  5,  SAGE  does  not  produce  qualitative  models  directly;  rather, 
SAGE  produces  probabilistic  generalizations  of  the  input  observations.  The  simulation 
transforms  these  into  qualitative  models  by  (1)  filtering  out  low-probability  statements  and  (2) 
creating  a  qualitative  model  using  the  temporal  data  within  the  remaining  high-probability 
statements.  The  resulting  model  describes  the  participants,  preconditions,  causes,  and  effects  of 
events. 
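
The two-step transformation can be illustrated with a minimal sketch. The representation here is an assumption made for illustration only: each statement carries a hypothetical temporal tag ("before" or "after" the event), the generalization is a mapping from statements to probabilities, and the 0.8 threshold is invented. It does not reproduce SAGE's data structures or the simulation's actual model-construction procedure.

```python
from typing import Dict, List, NamedTuple


class Statement(NamedTuple):
    fact: str    # e.g. "(touches pusher pushed)"
    phase: str   # hypothetical temporal tag: "before" or "after" the event


def to_qualitative_model(
    generalization: Dict[Statement, float],
    threshold: float = 0.8,  # illustrative cutoff, not a value from the text
) -> Dict[str, List[str]]:
    """Filter a probabilistic generalization and split it by temporal phase."""
    # Step 1: filter out low-probability statements.
    kept = [s for s, p in generalization.items() if p >= threshold]
    # Step 2: use the temporal tags to separate preconditions from effects.
    return {
        "preconditions": sorted(s.fact for s in kept if s.phase == "before"),
        "effects": sorted(s.fact for s in kept if s.phase == "after"),
    }
```

For instance, a statement that holds in only a fifth of the generalized observations would be dropped, while statements that hold in nearly all of them would become preconditions or effects of the event model.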

The  third  claim  involves  modeling  two  types  of  conceptual  change: 

Claim  3:  Human  mental  model  transformation  and  category  revision  can  both  be  modeled 
by  iteratively  (1)  constructing  explanations  and  (2)  using  meta-level  reasoning  to  select 
among  competing  explanations  and  revise  domain  knowledge. 


Chapter  4  described  how  explanations  are  constructed  and  how  meta-level  reasoning  decides 
which  explanation  is  preferred,  when  multiple  explanations  apply.  When  the  model  replaces  its 
preferred explanation for a phenomenon (e.g., how blood flows from the heart to the body) with
a  new  explanation,  it  will  use  the  beliefs  within  the  new  explanation  to  solve  similar  problems 
and  answer  related  questions  in  the  future.  This  means  that  replacing  a  preferred  explanation  is  a 
context-sensitive  revision  of  beliefs.  The  simulations  in  Chapters  6,  7,  and  8  exemplify  this 
behavior. In these three simulations, a sequence of these revisions simulates the adoption of new
causal  mechanisms  (Chapter  6),  the  integration  of  new  components  into  an  existing  mental 
model  (Chapter  7),  and  the  transformation  of  a  category  (Chapter  8). 

The  third  claim  also  mentions  meta-level  revision.  In  Chapter  8,  the  system  copies  and 
revises  its  domain  knowledge  when  it  fails  to  consistently  explain  a  phenomenon.  By  using 
declarative  heuristics,  the  model  can  estimate  the  amount  of  change  a  heuristic  will  incur  to 
domain  knowledge  and  then  choose  the  one  that  incurs  the  least  estimated  change.  This  revision 
operation  frees  the  system  from  a  failure  mode,  so  the  system  then  resumes  the  above 
explanation  construction  and  explanation  evaluation  methods. 
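
The selection of a revision heuristic by least estimated change can be sketched as follows. The Heuristic record, the dictionary-based knowledge representation, and the integer change estimates are all hypothetical choices made for this illustration; the text specifies only that the system copies its domain knowledge, estimates the change each heuristic would incur, and applies the least costly one.

```python
import copy
from typing import Callable, Dict, List, NamedTuple, Sequence, Tuple

Knowledge = Dict[str, List[str]]  # hypothetical: quantity name -> its properties


class Heuristic(NamedTuple):
    name: str
    estimate_change: Callable[[Knowledge], int]  # estimated amount of revision
    apply: Callable[[Knowledge], Knowledge]      # performs the revision


def revise(knowledge: Knowledge, heuristics: Sequence[Heuristic]) -> Tuple[str, Knowledge]:
    """Apply the heuristic with the least estimated change to a copy."""
    best = min(heuristics, key=lambda h: h.estimate_change(knowledge))
    # Revise a copy so that the prior knowledge remains available.
    revised = best.apply(copy.deepcopy(knowledge))
    return best.name, revised
```

Working on a copy mirrors the description above, in which the prior and revised knowledge coexist after a revision.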

The simulation results presented here provide evidence that our model is a plausible account
of  human  conceptual  change. 

9.2  Related  work  in  AI 

Here  we  discuss  other  AI  systems  that  learn  about  new  quantities,  causal  mechanisms,  and  causal 
relationships  between  phenomena.  Only  two  of  the  systems  we  review,  INTHELEX  and 
ToRQUE2,  have  been  used  to  simulate  human  conceptual  change.  Since  the  rest  of  these 
systems  are  not  cognitive  models,  we  compare  them  to  our  model  in  terms  of  the  knowledge 
representations  and  algorithms  used,  since  there  are  relevant  overlaps. 

The Qualitative Learner of Action and Perception (QLAP) (Mugan & Kuipers, 2011) learns
hierarchical  actions  from  continuous  quantities  in  an  environment.  QLAP  uses  qualitative 
reasoning  to  discretize  continuous  quantities  into  intervals,  using  the  quantities’  landmark  values. 
Dynamic  Bayesian  networks  (DBNs)  are  then  used  over  open  intervals  and  values  in  each 
quantity  space  to  track  contingencies  between  qualitative  values  and  events  in  the  world.  This  is 
useful  for  learning  preconditions  for  events  in  a  continuous  world.  This  could  provide  an 
account  for  how  preconceptions  might  be  learned  from  experience,  but  does  not  account  for  how 
they  are  revised  by  instruction  or  explanation  failures. 
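
The general idea of discretizing a continuous quantity by its landmark values can be illustrated with a small sketch. This is a generic illustration of landmark-based discretization, not QLAP's implementation; the function name and the string format of the qualitative values are assumptions.

```python
from bisect import bisect_left
from typing import Sequence


def qualitative_value(x: float, landmarks: Sequence[float]) -> str:
    """Map x to a landmark value or to the open interval between landmarks.

    landmarks must be sorted in ascending order.
    """
    i = bisect_left(landmarks, x)
    if i < len(landmarks) and landmarks[i] == x:
        return f"at {landmarks[i]}"           # x sits exactly on a landmark
    lo = landmarks[i - 1] if i > 0 else "-inf"
    hi = landmarks[i] if i < len(landmarks) else "+inf"
    return f"({lo}, {hi})"                     # open interval containing x
```

With landmarks 0.0 and 10.0, a reading of 5.0 falls in the open interval between them, while a reading of exactly 0.0 is reported as the landmark itself.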

Automated  Mathematician  (AM)  (Lenat  &  Brown,  1984)  was  an  automated  discovery 
system  that  used  heuristics  to  apply  and  revise  domain  knowledge.  AM  operated  within  the 
domain  of  mathematics,  with  its  concepts  represented  as  small  Lisp  programs.  The  control 
structure  involved  selecting  a  mathematical  task  from  the  agenda  and  carrying  it  out  with  the 
help  of  heuristics  that  activate,  extend,  and  revise  AM’s  mathematical  concepts.  The 
mathematical  concepts  were  then  used  for  solving  problems  on  AM’s  agenda.  EURISKO 
(Lenat,  1983)  improved  upon  AM  by  using  a  more  constrained  frame-based  representation  and 
allowing  heuristics  to  modify  other  heuristics.  This  provided  a  more  sophisticated  meta-level, 
where  components  influenced  each  other  in  addition  to  the  mathematical  concepts.  Both  AM 
and  EURISKO  contained  structures  designed  to  control  and  mutate  the  object-level  concepts  that 
did  the  primary  domain-level  reasoning.  Also,  both  systems  relied  on  humans  to  evaluate  the 
intermediate products of reasoning, whereas our model learns autonomously from instruction and
observation. Additionally, our model incorporates other types of reasoning such as analogy,
abduction, and qualitative reasoning to learn in scientific domains.

Meta-AQUA (Ram & Cox, 1994; Cox & Ram, 1999) is a story understanding system that
learns from expectation failures. The system monitors its progress in explaining events within
stories.  When  explanation  fails,  it  triggers  meta-level  control  to  set  knowledge  goals  such  as 
reorganizing  hierarchies  and  acquiring  new  information.  It  does  this  using  two  general 
representations  for  metareasoning:  (1)  Meta-XPs,  which  describe  the  system’s  goal-directed 
reasoning,  and  (2)  Introspective  Meta-XPs,  which  describe  a  failure  in  reasoning,  rationale  for 
the  failure,  the  knowledge  goals  to  solve  the  failure,  and  algorithms  for  satisfying  the  knowledge 
goals.  Like  our  category  revision  simulation  in  Chapter  8,  Meta-AQUA  uses  metareasoning  in 
reaction  to  failure  by  identifying  deficits  in  knowledge  and  proposing  repairs. 

ECHO  (Thagard,  2000)  is  a  connectionist  model  that  uses  constraint  satisfaction  to  judge 
hypotheses  by  their  explanatory  coherence.  This  is  designed  to  model  how  people  might  revise 
their  beliefs,  given  the  propositions  and  justification  structure  in  their  working  memory.  ECHO 
operates  at  the  level  of  propositions,  creating  excitatory  and  inhibitory  links  between  consistent 
and inconsistent propositions, respectively. ECHO uses a winner-take-all network, which, while
computationally powerful, means that it cannot distinguish between an absence of evidence for
competing propositions and balanced conflicting evidence for them. ECHO does not generate
its  own  theories  or  justification  structure,  as  our  system  does. 

ACCEPTER (Leake, 1992; Schank et al., 1994) is a case-based reasoning system that
detects  anomalies  within  a  situation  and  resolves  them  by  constructing  explanations.  After 
detecting  an  anomaly,  ACCEPTER  encodes  an  anomaly  characterization  that  sets  knowledge 
goals  and  helps  retrieve  relevant  explanation  patterns  (Schank,  1986)  from  a  library  thereof.  It 
then evaluates each candidate explanation pattern with respect to whether it explains the anomaly
and whether it is plausible. For instance, explaining the Challenger explosion as Russian
sabotage is implausible because Russia would not risk a dangerous confrontation with the United
States.  As  in  our  model,  constructing  and  evaluating  explanations  is  central  to  ACCEPTER; 
however,  our  system  also  replaces  explanations  with  preferred  ones  to  perform  belief  revision. 

One  problem  of  case-based  explanation  systems  such  as  ACCEPTER  is  that  retrieved  cases 
and  explanation  patterns  may  not  apply  to  the  present  context.  When  TWEAKER  (Schank  et  al., 
1994)  retrieves  an  explanation  that  is  a  close  -  but  not  perfect  -  match  to  the  current  problem,  it 
uses  adaptation  strategies  to  build  variations  of  the  explanation.  These  adaptations  include 
replacing an agent, generalizing or specifying slot-fillers, and so forth. TWEAKER can also use
strategy  selection  to  choose  between  possible  strategies,  which  helps  guide  search  through  a 
large  explanation  search  space.  Our  category  revision  simulation  in  Chapter  8  is  similar  to 
TWEAKER  in  that  it  uses  revision  heuristics  as  its  adaptation  strategies,  and  it  scores  and  sorts 
heuristics  as  its  strategy  selection. 

INTHELEX (Esposito et al., 2000) is an incremental theory revision program that has
modeled  conceptual  change  as  supervised  learning.  It  implements  belief  revision  as  theory 
refinement,  so  it  minimally  revises  its  logical  theories  whenever  it  encounters  an  inconsistency. 
INTHELEX  is  capable  of  learning  several  intuitive  theories  of  force  from  observations,  but  it  has 
not  simulated  the  transition  from  one  intuitive  theory  to  another.  The  transition  between  intuitive 
theories  (e.g.,  in  Chapter  8)  is  a  central  principle  for  simulating  conceptual  change,  so  while 
INTHELEX  may  simulate  how  intuitive  theories  are  acquired,  it  does  not  simulate  conceptual 
change  at  the  scale  proposed  in  this  dissertation. 

The  ToRQUE  and  ToRQUE2  systems  (Griffith,  Nersessian,  &  Goel,  1996;  2000)  solve 
problems  using  structure-behavior-function  (SBF)  models.  To  solve  a  new  target  problem, 
ToRQUE2 retrieves analogs to the present problem, and then applies transformations to the
analog or target problems to reduce their differences. This generates additional SBF models and
generates  a  solution  to  the  target  problem  using  transformed  domain  knowledge.  ToRQUE2  has 
simulated  how  scientists  solve  problems  during  think-aloud  protocols,  where  the  scientists 
change  their  understanding  throughout  the  problem-solving  session.  For  instance,  a  scientist 
initially  believes  that  the  stretch  of  a  spring  is  due  to  its  flexibility,  and  then  realizes  that  a  spring 
maintains  constant  slope  when  stretched  through  torsion  in  the  spring’s  wire.  The  authors 
conclude  that  this  spring  example  is  “an  instance  of  highly  creative  problem  solving  leading  to 
conceptual  change”  (p.  1).  Since  ToRQUE2  revises  domain  knowledge  to  overcome  failures  in 
problem-solving,  and  the  new  spring  model  conflicts  with  the  previous  one,  this  is  a  type  of 
mental  model  transformation.  By  comparison,  conceptual  change  is  triggered  differently  in  our 
cognitive  model,  and  our  model  searches  for  consistent,  low-complexity  models  that  fit  multiple 
observations  (e.g.,  Chicago’s  and  Australia’s  seasons,  in  Chapter  6). 

Explanation-Based Learning (EBL) systems (DeJong, 1993) learn by creating explanations
from existing knowledge. Many EBL systems learn by chunking explanation structure into a
single rule (e.g., Laird et al., 1987). Chunking speeds up future reasoning by avoiding extra
instantiations  when  a  macro-level  rule  exists,  but  it  does  not  change  the  deductive  closure  of  the 
knowledge  base,  and  therefore  cannot  model  the  repair  of  incorrect  knowledge.  Other  systems 
use explanations to repair knowledge. For example, Winston and Rao’s (1990) system uses explanations
to repair error-prone classification criteria, where explanations are trees of if-then rules over
concept features. Upon misclassification, the system analyzes its explanations and creates censor
rules to prevent future misclassification. Similarly, our model detects inconsistencies within and
across explanations in its analysis, but it encodes epistemic preferences (rather than censor rules)
to resolve these issues.

Other systems construct explanations using abduction to extend or revise their domain
knowledge. Molineaux et al. (2011) describe a system that determines the causes of plan
failures  through  abduction.  Abduction  increases  the  agent’s  knowledge  of  hidden  variables  and 
consequently  improves  the  performance  of  planning  in  partially-observable  environments. 
Similarly,  ACCEL  (Ng  &  Mooney,  1992)  creates  multiple  explanations  via  abduction,  and  it 
uses  simplicity  and  set-coverage  metrics  to  determine  which  is  best.  When  performing  diagnosis 
of  dynamic  systems,  ACCEL  makes  assumptions  about  the  state  of  components  (e.g.,  a 
component is abnormal or in a known fault mode), and minimizes the number of assumptions
used.  By  contrast,  when  our  system  evaluates  explanations,  some  assumptions  (e.g.,  quantity 
changes)  are  more  expensive  than  others,  and  other  artifacts  (e.g.,  contradictions,  model 
fragments,  and  model  fragment  instances)  incur  costs. 
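
The contrast with counting assumptions can be made concrete with a small weighted-cost sketch. The artifact kinds follow the examples named above, but the numeric weights and function names are invented for illustration and are not the costs used in our simulations.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical weights: quantity-change assumptions cost more than other
# assumptions, and non-assumption artifacts also incur costs.
COSTS = {
    "quantity_change_assumption": 40,
    "assumption": 10,
    "contradiction": 100,
    "model_fragment": 4,
    "model_fragment_instance": 2,
}


def explanation_cost(artifacts: Iterable[str]) -> int:
    """Sum the weighted cost of every artifact an explanation uses."""
    return sum(COSTS[kind] * n for kind, n in Counter(artifacts).items())


def preferred(explanations: Dict[str, Iterable[str]]) -> str:
    """Return the name of the lowest-cost explanation."""
    return min(explanations, key=lambda name: explanation_cost(explanations[name]))
```

Under these illustrative weights, an explanation with three artifacts can still be preferred over one with two, because what is assumed matters as well as how much is assumed.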

Other  systems  reason  with  abduction  under  uncertainty  while  still  using  structured  relational 
knowledge.  Bayesian  Abductive  Logic  Programs  (Raghavan  &  Mooney,  2010)  and  Markov 
Logic  Networks  (Richardson  &  Domingos,  2006;  Singla  &  Mooney,  2011)  have  been  used  for 
these  purposes.  Uncertainty  is  an  important  consideration  for  reasoning  about  psychological 
causality  (e.g.,  recognizing  an  agent’s  intent)  and  for  reasoning  about  physical  phenomena  in  the 
absence  of  mechanism-based  knowledge.  In  this  thesis  we  are  specifically  concerned  with 
abduction  using  mechanism-based  knowledge,  so  probability  distributions  are  not  as  central  as 
for  other  tasks  and  domains.  That  said,  probabilities  might  represent  the  agent’s  purported 
likelihood  of  a  given  belief  or  model  fragment  in  one  of  the  domains  simulated  here,  which  could 
direct  the  search  for  explanations.  We  revisit  this  idea  below. 

Previous  research  in  AI  has  produced  postulates  for  belief  revision  in  response  to 
observations.  The  AGM  postulates  (Alchourron  et  al.,  1985)  describe  properties  of  rational 
revision operations for expansion, revision, and contraction of propositional beliefs within a
deductively-closed  knowledge  base.  Katsuno  and  Mendelzon’s  (1991)  theorem  shows  that  these 
postulates  can  be  satisfied  by  a  revision  mechanism  based  on  total  pre-orders  over  prospective 
KB interpretations. Like these approaches, our conceptual change model computes total pre-orders
over belief sets, but our system is concerned with consistency within and across preferred
explanations  rather  than  within  the  entire  KB.  Further,  since  our  model  has  an  explanatory  basis, 
it  uses  truth  maintenance  methods  (Forbus  &  de  Kleer,  1993)  to  track  the  justification  structure 
and  assumptions  supporting  its  beliefs. 

9.3  Comparison  to  other  theories  of  conceptual  change 

Our  computational  model  shares  some  psychological  assumptions  with  individual  theories  of 
conceptual  change  discussed  in  Chapter  2.  We  review  important  overlaps  and  disagreements 
with  each  theory,  citing  examples  from  our  simulations  to  illustrate. 

9.3.1  Knowledge  in  pieces 

Like the knowledge in pieces perspective (diSessa, 1988; 1993; diSessa et al., 2004), our
computational  model  assumes  that  domain  knowledge  is  -  at  some  level  -  stored  as  individual 
elements.  These  elements  are  combined  into  larger  aggregates  to  predict  and  explain 
phenomena, and can then be recombined into new constructs to accommodate new information.
Additionally, when new information is encountered via observation or formal instruction, the
new information coexists with the previous elements, even when they are mutually incoherent or
inconsistent.

Our theory diverges from knowledge in pieces regarding the structures that organize these
domain  knowledge  elements  and  the  representation  of  the  elements  themselves.  In  our  model, 
explanations  are  persistent  structures  that  aggregate  domain  knowledge  elements.  The 
knowledge  in  preferred  explanations  is  reused  to  explain  new  observations  and  solve  new 
problems.  Belief  revision  is  performed  by  revising  which  explanations  are  preferred,  which 
thereby  affects  future  reuse  of  knowledge.  By  contrast,  knowledge  in  pieces  assumes  a  set  of 
structured  cueing  priorities  that  activate  these  elements  in  working  memory,  based  on  how  these 
elements  were  previously  coordinated  (diSessa,  1993).  Belief  revision  is  achieved  by  altering 
these  priorities.  Additionally,  knowledge  in  pieces  assumes  several  types  of  domain  knowledge, 
including  p-prims,  propositional  beliefs,  causal  nets,  and  coordination  classes.  By  contrast,  our 
model  uses  only  propositional  beliefs  and  model  fragments. 

9.3.2  Carey’s  theory 

Like  Carey’s  (2009)  theory  of  conceptual  change,  our  computational  model  assumes  that  a  single 
category  such  as  force  can  have  multiple,  incommensurable  meanings.  The  student  has 
simultaneous  access  to  both  of  these  meanings,  but  they  are  contextualized.  In  both  Carey’s 
theory  and  our  model,  conceptual  change  is  driven  by  these  category-level  conflicts,  but  in  our 
model, conceptual change is also driven by other explanatory inconsistencies and preferences. Also
like  Carey’s  theory,  our  computational  model  relies  on  the  processes  of  analogy,  abduction,  and 
model-based  reasoning  to  achieve  conceptual  change. 

Our  model  differs  from  Carey’s  theory  on  how  knowledge  is  contextualized.  Carey  (2009) 
assumes  that  new  conceptual  systems  are  established  to  store  incommensurable  categories,  and 
that  analogy,  abduction,  and  model-based  thought  experiments  add  causal  structure  to  these  new 
conceptual systems. Our model’s knowledge is contextualized at the explanation level, so that
two  phenomena  may  be  explained  using  mutually  incoherent  or  inconsistent  explanations.  When 
our  model  finds  contradictions  across  preferred  explanations,  these  are  resolved  locally,  to 
increase  the  coherence  between  these  explanations.  Thus,  our  model  adopts  new  information 
and  revises  its  explanations  to  improve  coherence  (e.g.,  by  reducing  cost,  in  Chapter  6),  but  it 
does  not  strongly  enforce  coherence  in  a  discrete  conceptual  system. 

9.3.3  Chi’s  categorical  shift 

Like  Chi’s  (2008;  2000)  theory  of  conceptual  change  and  mental  model  transformation,  our 
computational  model  relies  on  self-directed  explanation  to  integrate  new  information.  Chi’s 
(2008)  account  of  mental  model  transformation  involves  a  series  of  belief-level  refutations, 
which  cause  belief  revision  and  the  adoption  of  instructional  material.  These  belief  revisions 
change  the  structure,  assumptions,  and  predictions  of  a  mental  model.  In  our  system,  the  model 
of a system such as the human circulatory system is composed of model fragments and
propositional  beliefs.  As  in  Chi’s  theory,  revising  propositional  beliefs  can  change  the  structure 
of  this  model. 

Our  model  differs  from  Chi’s  theory  in  how  it  revises  information.  Chi  (2008)  assumes  that 
categories  are  directly  shifted  across  ontological  categories,  e.g.,  the  category  “force”  is  shifted 
from  a  “substance”  to  a  “constraint-based  interaction.”  The  category  is  only  shifted  once  the 
target  category  (i.e.,  “constraint-based  interaction”)  is  understood.  The  number  of  resulting 
changes  to  ontological  properties  and  the  unfamiliarity  of  the  target  category  both  increase  the 
difficulty  of  the  change.  By  contrast,  the  simulation  in  Chapter  8  uses  heuristics  to  revise  the 
properties  of  a  category,  and  the  new  and  old  categories  coexist,  albeit  in  different  explanations. 
The system incrementally transitions to the new category by a process of retrospective
explanation. 

9.3.4  Vosniadou’s  framework  theory 

Vosniadou’s  (1994;  2002;  2007)  theory  of  conceptual  change  assumes  that  students  have  a 
generally  coherent  framework  theory.  The  framework  theory  consists  of  specific  theories  about 
phenomena,  mental  models  of  systems  and  objects,  and  presupposition  beliefs  that  constrain  the 
theories  and  mental  models  within  the  framework.  Our  model  has  similar  interdependencies 
between  constructs,  but  these  are  soft  constraints.  For  example,  in  Chapter  6,  the  system  was 
given  the  credible  information  that  Australia  and  Chicago  experience  opposite  seasons  at  the 
same  time.  This  information  in  adopted  domain  knowledge  constrained  the  explanations  of 
Australia’s  and  Chicago’s  seasons.  In  this  manner,  credible  beliefs  in  adopted  domain 
knowledge  are  analogous  to  presuppositions,  and  specific  theories  are  analogous  to  explanations. 

One  important  difference  between  Vosniadou’s  theory  and  our  model  is  that  Vosniadou’s 
theory  assumes  a  generally  coherent  framework  theory,  where  our  model  utilizes  local 
explanatory  structures.  In  our  model,  coherence  and  consistency  are  secondary,  macro-level 
phenomena; they are not hard requirements on the system of beliefs. Our model holds internally
consistent, globally inconsistent explanations in memory simultaneously and then increases
global  coherence  using  cost-based  belief  revision  (e.g.,  in  the  seasons  simulation  in  Chapter  6) 
and  retrospective  explanation  (e.g.,  in  the  simulations  in  Chapters  7  and  8). 

Like  Vosniadou’s  theory,  our  model  makes  the  minimal  change  to  categories  such  as  force 
(e.g.,  in  Chapter  8)  to  resolve  contradictions.  Importantly,  the  prior  category  of  force  ceases  to 
exist in Vosniadou’s framework theory because it is inconsistent with the new version of the
category. Conversely, our model retains the prior category and encodes a preference for the
revised  version.  It  then  incrementally  transitions  to  it  via  retrospective  explanation,  when 
possible  (e.g.,  in  Chapter  8).  This  means  that  in  our  model,  the  agent’s  knowledge  is  not  globally 
coherent  or  even  globally  consistent.  The  processes  of  cost-based  belief  revision  (Chapter  6)  and 
preference-based  retrospective  explanation  (Chapters  7  and  8)  make  local,  incremental  repairs  to 
adopt preferred knowledge and reduce complexity.

9.3.5  Novel  aspects  of  our  model  as  a  theory  of  conceptual  change 

As  a  theory  of  human  conceptual  change,  our  model  relies  more  heavily  on  the  processes  of 
explanation  (e.g.,  Chapters  6-8)  and  comparison  (e.g.,  Chapters  5  and  8)  than  these  other  theories 
of  conceptual  change.  As  discussed  in  Chapter  1,  our  model  assumes  that  explanations  are 
persistent  structures  that  organize  domain  knowledge.  Further,  it  assumes  that  phenomena  are 
associated  with  their  preferred  explanation  in  memory,  so  that  people  can  retrieve  a  previously- 
explained  observation  and  use  its  explanation  -  or  the  knowledge  therein  -  to  explain  a  new 
observation  using  first  principles  reasoning.  The  assumption  that  people  retain  the  complete 
structure  of  explanations  is  probably  too  strong,  and  we  discuss  opportunities  for  relaxing  this 
assumption  below. 

In  the  theories  of  Chi,  Carey  and  Vosniadou,  we  can  point  to  a  “completed”  state  of 
conceptual  change.  Consider  the  following  examples  of  completing  conceptual  change: 

•  In Chi’s theory, consider a student who conceives of “force” as a type of “substance.”
She learns a target category such as “constraint-based interaction” (Chi et al., 1994a;
Chi, 2008), and then shifts the concept of “force” to become a subordinate of this target
category.

•  In  Carey’s  theory,  consider  a  student  who  has  mistaken  knowledge  of  “force”  and 
“mass”  concepts.  During  formal  instruction,  a  new  conceptual  system  is  established  to 
store  new  categories  of  “force”  and  “mass”  that  are  incommensurable  with  existing 
categories  of  the  same  name.  Instruction  provides  the  relational  structure  between  these 
new symbols, and modeling processes provide causal structure for the new
conceptual  system. 

•  In  Vosniadou’s  theory,  consider  that  a  student  believes  the  earth  is  flat,  like  a  pancake. 
She revises her set of presuppositions about the earth, and now conceives of it as an
astronomical  object.  This  means  that  objects  on  the  “sides”  and  “bottom”  of  the  earth  do 
not  fall  off.  This  alters  the  constraints  on  her  mental  models  of  the  earth,  so  she  revises 
her  mental  model  of  the  earth  to  be  a  sphere,  with  people  living  on  the  “sides”  and 
“bottom”  as  well. 

Is  there  a  similar  absolutely  “completed”  narrative  for  our  model?  It  seems  unlikely.  To 
illustrate,  suppose  that  our  model  has  learned  and  used  a  category  of  force  similar  to  the 
“Internal”  meaning  of  force  (see  Chapter  8)  to  explain  many,  diverse,  phenomena.  If  it  copies 
and  revises  this  category  of  force,  it  can  quickly  use  the  revised  version  to  retrospectively  explain 
a very small but salient subset of its experiences. If these experiences are the ones most
frequently  retrieved  for  future  learning  and  question  answering,  the  new  category  and  model 
fragments  will  be  propagated.  However,  a  completed  conceptual  change  would  require  that 
every observation explained with the prior category is retrospectively explained with the new
category. This seems unlikely. However, it does capture an important property of human mental
model  reasoning,  that  people  do  indeed  have  multiple,  inconsistent  models  for  the  same 
phenomena  in  different  circumstances  (Collins  &  Gentner,  1987). 

The  absence  of  a  “completed”  state  in  our  model  means  that  it  does  not  simulate  a  strong 
“gestalt  switch”  (Kuhn,  1962)  between  representations.  While  we  have  modeled  revolutionary 
local  changes  to  sets  of  explanations  (Chapter  6)  or  representations  (Chapter  8),  the  propagation 
and  belief  revision  across  contexts  is  an  incremental,  evolutionary  process.  This  propagation 
process  is  more  amenable  to  Toulmin’s  (1972)  model  of  conceptual  change  in  science,  which 
abandons  a  discrete  notion  of  “before  and  after.” 

9.4  Future  work  and  limitations 

Conceptual  change  is  vast.  In  terms  of  time,  psychological  conceptual  change  in  a  domain  such 
as  force  dynamics  can  take  place  over  at  least  a  decade  (e.g.,  Ioannides  and  Vosniadou,  2002) 
and misconceptions are often retained despite years of formal instruction (Clement, 1982). In
terms  of  information,  human  conceptual  change  is  promoted  by  specialized  curricula  (e.g., 
Brown,  1994)  and  hindered  by  years  of  using  productive  misconceptions  (Smith,  diSessa,  and 
Roschelle,  1993).  In  terms  of  cognitive  processes,  conceptual  change  is  driven  by  model-based 
reasoning (Nersessian, 2007; Griffith et al., 2000), analogy (Brown & Clement, 1989; Gentner et
al., 1997; Carey, 2009), anomaly (Chinn & Brewer, 1998; Posner et al., 1982), explanation
construction  (Chi  et  al.,  1994a;  Sherin  et  al.,  2012),  social  factors  (Pintrich,  Marx,  &  Boyle, 
1993),  and  belief  refutation  (Chi,  2000;  2008).  There  is  much  to  be  done  to  model  the  full  range 
of this phenomenon.

Our model may be extended to capture more aspects of psychological conceptual change
along  all  of  these  dimensions.  Each  opportunity  for  extension  represents  a  current  limitation  in 
our  model,  so  we  discuss  these  in  tandem.  We  also  discuss  how  a  model  of  conceptual  change 
might  be  practically  applied  in  other  software  systems. 

9.4.1  Simulating  over  larger  timescales 

While  the  simulations  presented  here  capture  the  qualitative  characteristics  and  trajectories  of 
psychological  conceptual  change,  the  changes  occur  over  many  orders  of  magnitude  fewer 
stimuli  than  students.  To  capture  a  humanlike  timescale  of  conceptual  change,  we  need  to  adjust 
(1)  the  system’s  response  to  explanation  failure  and  (2)  the  number  and  nature  of  training  data. 

Our  model  is  more  proactive  than  students  in  terms  of  changing  its  knowledge.  One  reason 
for  this  is  that  people  have  many  responses  to  anomalous  data  besides  revising  their  domain 
knowledge.  Several  anomaly-response  actions  have  been  identified  in  Chinn  &  Brewer  (1993; 
1998),  such  as  ignoring  anomalous  data,  holding  the  data  in  abeyance,  exempting  the  data  from  a 
theory’s  applicability,  and  re-explaining  the  data  to  fit  within  a  theory.  Feltovich  et  al.  (2001) 
identify additional tactics people employ to prevent making changes to domain knowledge.
Implementing  additional  strategies  for  explanation  failure  will  slow  the  rate  of  conceptual  change 
in simulation. Making these decisions requires access to metaknowledge about the to-be-changed
beliefs, much of which is already available in the explanation-based network.45
Modeling conservatism in revising domain knowledge can help us understand the factors that
make misconceptions resilient, and it might have a practical benefit of helping cognitive systems
avoid unnecessary computation.

45 Some relevant metaknowledge already included in the model: (1) the number of (preferred) explanations
supported by a belief; (2) the ratio of preferred explanations to non-preferred explanations supported by a belief; (3)
the alternate explanations for explanandums; (4) concept-level preferences between beliefs and model fragments;
and (5) the conditional probability of using some belief in an explanation given that another belief is also used.
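
Some of the metaknowledge listed in footnote 45 can be illustrated with a minimal sketch. The `Explanation` class and the three functions below are hypothetical names introduced only for illustration; they are not the representation used in the implemented system, and they treat an explanation simply as a set of belief labels with a preferred flag.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Explanation:
    """Hypothetical stand-in for a stored explanation (assumption, not the thesis code)."""
    beliefs: FrozenSet[str]
    preferred: bool


def preferred_support(belief: str, explanations: List[Explanation]) -> int:
    """Item (1): number of preferred explanations that use the belief."""
    return sum(1 for e in explanations if e.preferred and belief in e.beliefs)


def preferred_ratio(belief: str, explanations: List[Explanation]) -> float:
    """Item (2): ratio of preferred to non-preferred explanations using the belief."""
    preferred = preferred_support(belief, explanations)
    other = sum(1 for e in explanations if not e.preferred and belief in e.beliefs)
    if other == 0:
        return float("inf") if preferred else 0.0
    return preferred / other


def conditional_use(belief: str, given: str, explanations: List[Explanation]) -> float:
    """Item (5): fraction of explanations using `given` that also use `belief`."""
    with_given = [e for e in explanations if given in e.beliefs]
    if not with_given:
        return 0.0
    return sum(1 for e in with_given if belief in e.beliefs) / len(with_given)


# Toy data: three explanations over two belief labels.
explanations = [
    Explanation(frozenset({"axis-tilted", "earth-orbits-sun"}), preferred=True),
    Explanation(frozenset({"earth-orbits-sun"}), preferred=False),
    Explanation(frozenset({"axis-tilted", "earth-orbits-sun"}), preferred=False),
]
support = preferred_support("earth-orbits-sun", explanations)
ratio = preferred_ratio("earth-orbits-sun", explanations)
p_tilt_given_orbit = conditional_use("axis-tilted", "earth-orbits-sun", explanations)
```

A belief with high preferred support would plausibly be costly to revise, which is one way such quantities might inform a more conservative response to explanation failure.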

Another  reason  why  conceptual  change  takes  much  longer  in  people  is  that  people  must  sift 
the  relevant  from  the  irrelevant,  and  deal  with  incomplete  and  noisy  information.  The  training 
data  in  Chapters  5  and  8  are  automatically  encoded  from  comic  graphs,  which  we  believe  is  an 
important  first  step  in  simulating  conceptual  change  over  larger  timescales;  even  so,  the  stimuli 
are  sparser  than  observations  in  the  real  world.  All  else  being  equal,  adding  extraneous  entities 
and  relations  to  the  training  data  will  make  analogical  retrieval  less  effective  and  delay  the 
discovery  of  qualitative  proportionalities  via  analogy  (e.g.,  in  Chapter  8),  which  will  slow  the 
rate  of  learning.  Additionally,  the  comic  graphs  segment  each  observation  into  meaningful 
qualitative  states,  where  the  real  world  is  continuous.  Since  the  system  derives  quantity  changes 
from  these  states  rather  than  observing  them  directly,  it  does  not  have  to  differentiate  quantities 
such as speed, velocity, and acceleration, which is difficult for novice students (Dykstra et al.,
1992). Using a 3D physics engine as a learning environment (e.g., Mugan and Kuipers, 2010) is
a  promising  direction  for  providing  more  realistic  stimuli,  though  sparseness  is  still  an  issue. 

Memory  retrieval  might  also  contribute  to  the  duration  of  human  conceptual  change.  In  our 
model,  changes  are  propagated  by  (1)  encountering  a  new  scenario  that  needs  explaining,  (2) 
retrieving  previous,  similar  scenarios  from  memory  and  then  (3)  using  the  models  and  categories 
from  the  previous  explanations  to  solve  a  new  problem.  Since  people  are  most  often  reminded  of 
literally  similar  phenomena  (Forbus,  Gentner,  and  Law,  1995),  they  might  fail  to  reuse  models 
and categories to explain entire classes of relevant - but not literally similar - phenomena.
This would produce tightly-contextualized mental models, as is evident in Collins and Gentner’s
(1987) study of novice mental models of evaporation and diSessa et al.’s (2004) study of novice
mental models of force. As a result, when there are more analogs in memory, mental models
could  become  more  tightly  contextualized,  and  conceptual  change  might  become  more  difficult. 

A  final  consideration  for  the  timescale  of  conceptual  change  is  that  presently,  our 
simulations  perform  conceptual  change  in  isolation.  If  the  agent  had  other  operations  and 
incoming  observations  to  attend  to,  then  it  could  not  dedicate  as  much  time  to  retrospective 
explanation  operations.  This  means  that  a  greater  share  of  its  observations  would  be  explained 
using  outdated  knowledge,  all  else  being  equal.  This  would  ultimately  increase  the  likelihood 
that  outdated  knowledge  gets  reused  and  propagated,  delaying  the  rate  of  change. 

9.4.2  Improving  explanation  construction 

Our  model  considers  more  possibilities  than  people  seem  to  consider  when  it  constructs 
explanations.  For  instance,  in  one  of  the  simulation  trials  in  Chapter  7,  the  system  constructs  16 
distinct  explanations  for  why  Chicago’s  seasons  change.  It  then  evaluates  each  explanation  and 
chooses  the  explanation  that  the  student  gives  in  the  interview  transcript.  However,  the 
corresponding  student  in  the  study  seems  to  incrementally  generate  a  single  explanation  for 
Chicago’s  seasons  over  several  minutes. 

One  solution  to  this  problem  is  to  turn  our  abductive  model  formulation  algorithm  into  an 
incremental  beam  search.  This  would  mean  that  as  it  back-chains  and  instantiates  model 
fragments,  the  algorithm  only  considers  the  lowest  cost  (i.e.,  simplest  or  most  probable) 
alternative  that  it  has  not  yet  considered.  This  would  construct  a  single  explanation,  but  the 
problem  of  estimating  which  path  is  lowest  cost  is  difficult  without  looking  ahead.  Another  idea 
for  focusing  search  is  to  use  other  explanations  to  guide  the  search  for  a  new  explanation:  if 
other, preferred explanations tend to chain from model fragment A to model fragment B over
other alternatives, do the same in this case. Alternatively, the system could apply the old
explanation via analogical mapping and inference. Since analogical inference is not necessarily
sound, the system could perform focused abductive reasoning to fill in gaps in justification
structure  that  are  not  inferred. 
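
The first idea above, back-chaining while always expanding the lowest-cost untried alternative, can be sketched as follows. This is a minimal illustration under stated assumptions, not the model formulation algorithm of the thesis: the rule base is a hypothetical dictionary from a goal to (cost, antecedents) alternatives, costs are assumed to be known up front, and the search is a greedy depth-first walk with backtracking (effectively a beam of width one).

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical toy rule base: each goal maps to alternative justifications,
# each a (cost, antecedents) pair. An alternative with no antecedents stands
# for an observation or assumption; a goal with no entry cannot be explained.
Alternative = Tuple[float, List[str]]
RuleBase = Dict[str, List[Alternative]]
Justification = Tuple[str, List[str]]


def explain(
    goal: str, rules: RuleBase, visiting: FrozenSet[str] = frozenset()
) -> Optional[List[Justification]]:
    """Back-chain from `goal`, trying the cheapest alternative first.

    Returns the justifications of a single well-founded explanation,
    or None if every alternative fails.
    """
    if goal in visiting:  # refuse circular support
        return None
    visiting = visiting | {goal}
    for _cost, antecedents in sorted(rules.get(goal, []), key=lambda alt: alt[0]):
        justifications: List[Justification] = [(goal, antecedents)]
        for sub in antecedents:
            sub_explanation = explain(sub, rules, visiting)
            if sub_explanation is None:
                break  # this alternative fails; fall back to the next cheapest
            justifications.extend(sub_explanation)
        else:
            return justifications
    return None


rules: RuleBase = {
    "seasons-change": [
        (2.0, ["earth-near-sun", "earth-far-from-sun"]),
        (1.0, ["axis-tilted", "earth-orbits-sun"]),
    ],
    "axis-tilted": [(0.5, [])],
    "earth-orbits-sun": [(0.5, [])],
    "earth-near-sun": [(0.5, [])],
}
explanation = explain("seasons-change", rules)
```

The sketch sidesteps the difficulty noted above: it ranks alternatives by a fixed local cost, whereas estimating the true cost of a path would require looking ahead.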

Another  solution  is  to  keep  the  same  model  formulation  algorithm  and  implement  a  greedy 
walk  through  the  resulting  justification  structure  to  only  reify  a  single  well-founded  explanation 
at  a  time.  This  back-loads  the  work,  since  when  the  system  needs  to  perform  belief  revision,  it 
will  have  to  consider  alternative  paths  through  justification  structure. 

9.4.3  Improving  explanation  evaluation 

We  have  described  two  means  of  computing  preferences  between  explanations:  cost  functions 
and  rules.  However,  these  are  only  as  effective  as  the  cost  bases  and  the  contents  of  the  rules, 
respectively.  At  present,  we  do  not  believe  that  either  of  these  is  complete.  One  gap  in  our  cost 
function  is  that  it  only  penalizes  for  inclusion  of  artifacts  such  as  contradictions  and  assumptions, 
but  it  does  not  penalize  for  omission  of  beliefs  within  an  explanation.  For  example,  a  student 
might  be  confident  that  the  tilt  of  the  earth  is  related  to  the  changing  of  the  seasons,  but  unsure  of 
the specific mechanics (e.g., Sherin et al., 2012). Consequently, any explanation the student
constructs  that  omits  the  earth’s  tilt  should  be  penalized.  This  might  be  simulated  by  encoding  a 
metaknowledge  relation  to  conceptually  associate  the  belief  that  the  earth  has  a  tilted  axis  with 
the  belief  that  the  seasons  change. 
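
One way to picture an omission penalty is the sketch below. The weights, the `associated` mapping, and the function name are assumptions made for illustration; the cost function described in this thesis does not currently include the omission term.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical penalty weights; the numeric values are illustrative assumptions.
CONTRADICTION_COST = 10.0
ASSUMPTION_COST = 1.0
OMISSION_COST = 5.0


def explanation_cost(
    beliefs: Set[str],
    assumptions: Set[str],
    contradictions: int,
    explanandum: str,
    associated: Dict[str, FrozenSet[str]],
) -> float:
    """Score an explanation; lower cost is preferred.

    `associated` is a hypothetical metaknowledge relation that maps an
    explanandum to beliefs the agent expects any explanation of it to use.
    """
    cost = CONTRADICTION_COST * contradictions + ASSUMPTION_COST * len(assumptions)
    omitted = associated.get(explanandum, frozenset()) - beliefs
    return cost + OMISSION_COST * len(omitted)


associated = {"seasons-change": frozenset({"earth-axis-tilted"})}
with_tilt = explanation_cost(
    {"earth-axis-tilted", "earth-orbits-sun"},
    {"tilt-affects-sunlight"},
    0,
    "seasons-change",
    associated,
)
without_tilt = explanation_cost(
    {"earth-near-sun-in-summer"}, set(), 0, "seasons-change", associated
)
```

Under these assumed weights, the explanation that makes one assumption but uses the earth's tilt scores lower than the assumption-free explanation that omits it.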

Rules  and  cost  functions  might  also  be  extended  to  capture  other  psychological  explanatory 
virtues  (Lombrozo,  2011).  For  instance,  we  can  compute  the  conditional  probability  of  multiple 
inferences to determine an explanation’s perceived probability. Other explanatory virtues
include the diversity of knowledge within an explanation, scope, fruitfulness, goal appeal, and fit
within  a  narrative  structure.  Some  of  these  may  be  computable  based  on  the  metaknowledge  in 
the  network  structure,  while  others,  such  as  narrative  structure,  might  require  comparison  to 
other  explanations  and  generalizations. 

9.4.4  Other  types  of  agency 

Right now, our system explains quantity changes using knowledge of physical mechanisms, but
physical mechanisms are only one type of causality. Dennett (1987) and Keil & Lockhart (1999)
identify  three  main  types  of  causality:  (1)  mechanical,  which  we  address  here;  (2)  intentional; 
and  (3)  teleology/design/function.  Most  adults  explain  why  a  boat  floats  via  mechanical 
causality,  using  knowledge  of  density  and  buoyancy.  Piaget  (1930)  found  that  children 
frequently  ascribe  intentions  (e.g.,  the  boat  doesn’t  want  to  sink)  or  teleology  (e.g.,  it  floats  so  we 
can  ride  on  top  of  the  water)  to  physical  situations.  This  results  in  anthropocentric  finalism, 
where  natural  phenomena  are  explained  relative  to  their  function  for  humans,  or  animism,  where 
nonliving things are assigned lives and intentions. Having the system learn when to use which
agency,  e.g.,  by  contextualizing  and  reusing  them  by  similarity  or  by  using  modeling 
assumptions  (discussed  below),  is  an  interesting  opportunity  to  model  these  aspects  of  cognitive 
development  as  conceptual  changes. 

For  example,  it  is  possible  that  the  two  students  in  Chapter  7  who  were  not  modeled  by  our 
simulation  arrived  at  their  final  model  using  teleological  explanation.  Recall  that  the  two 
students  who  were  not  simulated  in  Chapter  7  were  in  the  “prompted”  condition,  where  students 
explained  to  themselves  while  reading.  Both  used  the  incorrect  “single  loop  (lung)”  model  of  the 
circulatory system at the posttest, where blood flows from the heart, through the lungs, to the
body, and back. These students generated erroneous components within their mental models
through  self-explanation.  More  specifically,  they  might  have  (1)  understood  that  the  function  of 
the  lungs  is  to  oxygenate  the  blood  for  eventual  delivery  to  the  body  and  (2)  inferred  the 
structure  of  the  circulatory  system  by  attending  to  this  lung  function. 

9.4.5  Taking  analogy  further 

In  Chi  et  al.  (1994),  students  made  spontaneous  analogies  such  as  “the  septum  [of  the  heart]  is 
like  a  wall  that  divides  the  two  parts”  when  explaining  textbook  material.  While  our  system  uses 
analogy  to  retrieve  similar  examples  and  infer  qualitative  proportionalities  (Chapter  8),  it  does 
not  make  spontaneous  analogies  to  transfer  knowledge  across  domains.  Analogical  inference  is  a 
powerful  strategy  worth  incorporating  into  our  model  of  explanation  construction.  We  can 
sketch this idea very generally. As new information (e.g., about the septum dividing the sides of
the  heart)  is  incorporated  via  reading,  it  can  be  used  as  a  probe  to  MAC/FAC,  which  can  retrieve 
similar  concepts  (e.g.,  a  wall  dividing  two  spaces).  The  SME  mapping  between  the  new  and 
existing  concepts  produces  candidate  inferences,  which  can  elaborate  the  new  material  with 
respect  to  surface  characteristics,  function,  and  causal  structure.  As  mentioned  in  Chapter  3, 
analogical  inferences  might  not  be  deductively  valid,  so  this  might  produce  additional 
misconceptions (Spiro et al., 1989).

When  analogies  are  communicated  through  instruction  or  text,  they  have  the  capability  to 
foster  conceptual  change  (Brown,  1994;  Gentner  et  al.,  1997;  Vosniadou  et  al.,  2007).  These  are 
important  considerations  for  extending  the  system  further.  For  example,  bridging  analogies 
(Brown  &  Clement,  1989)  can  be  used  to  facilitate  the  transfer  of  knowledge  from  a  correct  base 
scenario (e.g., an outstretched hand exerts an upward force on a book at rest on its surface) to a
flawed target scenario (e.g., a table does not exert an upward force on a book at rest on its
surface). Through a sequence of bridging analogies, such as a book on a spring, a book on a
mattress, and a book on a pliable board, beliefs are imported into the target scenario. This
permits the construction of new explanations that can replace the old, flawed explanations. Since
analogical mapping and transfer are built into the Companions cognitive architecture, this is a
reasonable  next  step. 

9.4.6  Accruing  domain  knowledge 

The  simulations  in  Chapters  5  and  8  acquire  model  fragments  by  induction  and  heuristics, 
respectively.  By  contrast,  the  simulations  in  Chapters  6  and  7  start  with  hand-coded  model 
fragments,  based  on  pretests,  posttests,  and  interview  transcripts  with  students.  In  these 
simulations,  we  did  not  model  how  the  initial  qualitative  models  of  contained  fluid,  fluid  flow, 
fluid enrichment, astronomical heating, astronomical orbit, and so forth, are acquired by the
students. Presumably, people learn about these processes and relationships by some combination
of  interaction,  reading,  and  observation,  and  hand-coding  these  representations  is  not  good 
practice  for  cognitive  modeling  in  the  long  term.  A  more  ideal  solution  is  to  automatically 
encode  the  initial  knowledge  of  a  student  using  a  natural  language  understanding  (NLU)  system 
with  deep  semantic  interpretation  (e.g.,  Tomai  &  Forbus,  2009)  to  analyze  an  interview  transcript 
in  order  to  automatically  construct  the  initial  set  of  model  fragments. 

Acquiring  new  qualitative  model  fragments  from  text  is  an  unsolved  problem,  but  there  have 
been  advances  in  deriving  QP  theory  interpretations  from  natural  language  (Kuehne,  2004), 
which is an important component of learning reusable models.

9.4.7  Storing  explanations

Preferred  explanations  are  central  organizing  structures  in  our  model,  so  they  persist  over  time. 
Explanations  that  are  not  preferred  also  persist  over  time  in  our  model  because  they  might 
eventually  become  preferred  through  a  belief  revision  process,  as  in  Chapter  7.  Explanations  are 
very compact46 in our system, but the justification structure requires considerably more storage.
We did not encounter a performance degradation or storage bottleneck due to the algorithms and
explanation-based knowledge organization described here, but problems could arise if we
imagine a lifetime of experience and learning. These are important considerations for cognitive
modeling as well as for performance over time.

Figure 55: Using SAGE to cluster explanandums so that one explanation can
justify multiple observations that are close analogs of one another.

Storing the justification structure for all of the explanations in our system saves
computation, but it creates a potential storage bottleneck. We could feasibly store each
explanation as (B, M), and re-derive the justification structure when necessary using the
explanation construction algorithm over the beliefs and model fragments in B alone. This would
constrain the search for explanations to only the beliefs and model fragments within the previous
explanation. This relaxes the psychological assumption that people retain all of the justifications
for their beliefs, but it still assumes that people retain their preferred explanations.

46 Each explanation (J, B, M) is lightweight because the set of beliefs B and explanandums M are determinable based
on the set of justifications J. Consequently, the storage requirement for each explanation includes a symbol for itself
and a set of symbols indicating its justifications.

9.4.8  Clustering  explanandums 

In  addition  to  retaining  its  preferred  explanations,  the  system  retains  its  preferred  explanation  for 
each  explanandum.  This  means  that  whenever  a  new  phenomenon  is  explained,  a  preferred 
explanation  is  associated  with  that  exact  phenomenon  in  memory.  As  explanandums  are 
encountered and explained, this may become intractable, so it might be an unrealistic
psychological  assumption.  One  way  to  relax  this  assumption  is  to  (1)  use  analogical 
generalization  to  cluster  explanandums  using  unsupervised  learning  and  then  (2)  explain  each 
generalization.  This  is  illustrated  in  Figure  55. 

This  saves  space  as  well  as  computation.  For  instance,  consider  that  the  agent  must  explain 
why  a  ball  rolls  to  the  left  after  being  kicked,  and  it  has  a  SAGE  generalization  describing 
examples  of  this  very  phenomenon.  If  the  generalization  has  already  been  explained  by  some 
explanation  x,  then  no  first-principles  reasoning  has  to  occur  to  explain  the  ball  rolling  to  the  left 
-  the  agent  merely  has  to  construct  an  analogical  mapping  to  the  generalization  and  ensure  that 
the  generalization’s  explanation  x  holds  on  the  new  explanandum.  This  means  that  the  system 
would  only  generate  new  explanations  if  it  encounters  an  explanandum  that  is  not  structurally 
similar  to  a  previous  generalization  or  explanandum. 
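
The reuse step described above can be sketched roughly as follows. This is only an illustration: the `Generalization` class, the threshold, and the function names are hypothetical, and the Jaccard overlap of fact labels is a crude stand-in for the structural similarity that SME and SAGE compute, which it does not reproduce.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Generalization:
    """Hypothetical record: a clustered description plus its attached explanation."""
    facts: FrozenSet[str]
    explanation: str


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Stand-in similarity score (assumption); this is not structure mapping."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reuse_explanation(
    explanandum: FrozenSet[str],
    generalizations: List[Generalization],
    threshold: float = 0.6,
) -> Optional[str]:
    """Return a candidate stored explanation, or None if the new explanandum
    must still be explained from first principles."""
    best = max(
        generalizations, key=lambda g: jaccard(explanandum, g.facts), default=None
    )
    if best is not None and jaccard(explanandum, best.facts) >= threshold:
        return best.explanation
    return None


kicked_ball = Generalization(
    frozenset({"kick(agent, ball)", "touches(agent, ball)", "moves(ball)"}),
    explanation="x",
)
new_case = frozenset(
    {"kick(agent, ball)", "touches(agent, ball)", "moves(ball)", "direction(ball, left)"}
)
candidate = reuse_explanation(new_case, [kicked_ball])
unrelated = reuse_explanation(frozenset({"evaporates(water)"}), [kicked_ball])
```

The returned explanation is only a candidate; as noted above, the agent would still have to verify that it holds on the new explanandum.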

This idea of generalized explanandums is similar to the idea of storing prototype histories
(Forbus & Gentner, 1986), which describe generalizations of phenomena occurring over time.
We have demonstrated in (Friedman and Forbus, 2008) that SAGE can learn these from
examples, so it is a reasonable optimization.

9.4.9  Proactivity

These  simulations  perform  conceptual  change  as  a  result  of  observing  the  world  and  receiving 
instructional  material.  This  does  not  capture  the  more  active  aspects  of  human  learning,  such  as 
asking  questions,  planning,  experimenting,  and  teaching  others.  User  interaction  and  user 
modeling are central goals of the Companions cognitive architecture (Forbus et al., 2009) within
which  this  model  is  implemented,  so  progress  is  being  made  on  several  of  these  social  interaction 
fronts.  In  terms  of  experimentation,  the  present  model  provides  some  support  for  active  learning. 
For example, given the hypothesis that the distance a box slides is inversely proportional to its
size, the system might test this hypothesis by retrieving a previous example, increasing the size of
the  object,  and  requesting  a  training  datum  of  the  modified  observation.  Provided  this  new, 
solicited  observation,  the  system  could  detect  and  resolve  explanation  failures  as  already 
described  in  Chapter  8. 

9.4.10  Applying  the  model  of  conceptual  change 

The  model  of  conceptual  change  presented  here  might  be  practically  applied  within  intelligent 
tutoring systems (ITS; e.g., Koedinger et al., 1997). ITSs automatically deliver customized
feedback  to  a  student  based  on  the  student’s  performance.  They  often  include  a  task  model  to 
represent  expert  knowledge  and  a  student  model  to  track  student  knowledge.  Both  are  crucial  for 
diagnosing  student  misconceptions,  tracking  progress,  and  selecting  new  problems  to  maximize 
learning.  Our  computational  model  of  conceptual  change  uses  a  single  knowledge  representation 
strategy  to  represent  student  misconceptions  and  correct  scientific  concepts  in  several  scientific 
domains,  including  dynamics,  biology,  and  astronomy.  Consequently,  the  model  might 
ultimately be integrated into ITSs to (1) represent an extendable library of student knowledge,
(2) discover a student’s mental models using active learning in a tutoring session, (3) find
inconsistencies  in  a  student’s  mental  models  via  abduction  and  qualitative  simulation,  and  (4) 
guide  the  student  through  a  curriculum  to  confront  and  remedy  the  inconsistencies  according  to  a 
simulation  of  conceptual  change  using  his  or  her  mental  models.  This  requires  substantial 
additional  work,  but  progress  in  using  compositional  modeling  for  tutoring  systems  has  been 
made  by  de  Koning  et  al.  (2000)  and  others.  It  could  lead  to  Socratic  tutors  that  have  human-like 
flexibility (e.g., Stevens & Collins, 1977).

Appendix


Definitions 

We define several terms in the table below for ease of reference and clarity. Since we are
concerned with learning over time, we use the term “memory” to refer to long-term memory,
unless otherwise specified.


Term/Symbol
    Definition

belief
    A proposition represented as a relation reln and at least one argument
    $\{a_0, \ldots, a_n\}$, written as (reln $a_0$ ... $a_n$).

model fragment belief
    A belief referring to the existence of a model fragment m, of the form
    (isa m ModelFragment).

scenario microtheory
    A microtheory that includes beliefs and model fragments. Each scenario microtheory
    represents information gathered via observation, instruction, or another type of
    interaction.

$\mathbb{D} = \{b_0, \ldots, b_n\}$
    The domain knowledge microtheory, which contains beliefs that can be believed
    regardless of whether they are used in an explanation. This includes explanandums,
    model fragment beliefs, and other beliefs from observation and instruction.
    Inherits all beliefs from all scenario microtheories.

$\mathbb{D}_a = \{b_0, \ldots, b_m\} \subseteq \mathbb{D}$
    The adopted domain knowledge microtheory is the subset of the domain knowledge
    microtheory that is presently believed by the agent. For example, the agent may have
    the propositional belief “the heart oxygenates the blood” in $\mathbb{D}$ and not in
    $\mathbb{D}_a$. This permits the agent to reason about a belief’s consequences without
    believing it.

explanandum
    A phenomenon that requires an explanation,$^{47}$ represented as a set m of one or more
    beliefs $\{b_0, \ldots, b_m\}$. In our simulations, these range from sets of multiple
    beliefs (e.g., describing blood flowing from the heart to the body in Chapter 7) to
    sets of single beliefs (e.g., describing quantity changes in Chapter 8). Each
    explanandum is believed in $\mathbb{D}$.

$\mathbb{M}$
    The set of all explanandums in the agent’s memory.

$\mathbb{B} = \{b_0, \ldots, b_n\}$
    The provisional belief microtheory, containing beliefs that are either assumed or
    inferred from other knowledge. Beliefs in this microtheory are only believed if they
    are in an explanation that is adopted by the agent.

justification
    Rationale for belief. Includes rule-based inferences, model fragment instantiations,
    model fragment activations, and other rationale. Each justification j has a set of
    antecedent beliefs antes(j) and a set of consequence beliefs conseqs(j), such that
    the conjunction of antes(j) is sufficient to believe conseqs(j).

explanation
    Uniquely defined as $\langle J, B, M \rangle$. Represents a well-founded explanation
    $J \subseteq \mathbb{J}$ for some explanandum(s) $M \subseteq \mathbb{M}$. B is a set of
    beliefs comprising (1) beliefs supporting M through J and (2) metaknowledge$^{48}$
    $B_m$ about the explanation. More formally:

    $$B = B_m \cup \bigcup_{j \in J} \left( \text{antes}(j) \cup \text{conseqs}(j) \right)$$

explanation microtheory
    A single explanation microtheory exists for each explanation
    $\langle J, B, M \rangle$. Contains all beliefs B in the explanation, and is a proper
    subset of one or more beliefs in $\mathbb{B}$ and $\mathbb{D}$.

assumption
    An unjustified belief $b \in B$ that is not part of the domain theory $\mathbb{D}$.
    More formally, b is assumed in an explanation $\langle J, B, M \rangle$ if and only if
    it is part of the well-founded explanation but it is not itself justified.

explanation competition
    Occurs between two different explanations $\langle J, B, M \rangle$ and
    $\langle J', B', M' \rangle$ for explanandum m if and only if $m \in (M \cap M')$.

preferred explanation
    For an explanandum m, the agent’s preferred explanation.

$E = \{\langle m_0, x_0 \rangle, \ldots, \langle m_n, x_n \rangle\}$
    The explanandum mapping over every explanandum $m \in \mathbb{M}$ to its respective
    best explanation $x_i$. Exhaustive over all explanandums $\mathbb{M}$, but not
    necessarily over all explanations.

47  The term “explanandum” has been used to describe a phenomenon requiring an explanation. The explanandum is
typically the subject of a why question rather than a what question (Hempel & Oppenheim, 1948).

48  In our simulations, metaknowledge about an explanation includes beliefs about the structure of the explanation,
such as the presence of an asymmetric quantity change in a cyclic state space (e.g., Chapter 6). These beliefs affect
how preferences are computed between explanations, but they do not affect the justification structure.
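The definitions of justification, explanation, assumption, and explanation competition can be illustrated with a minimal Python sketch. This is not the thesis implementation: the class names, the use of plain strings as beliefs, and the example beliefs about the heart are all hypothetical, chosen only for illustration. The sketch also assumes that metaknowledge beliefs are never counted as assumptions, which the definitions above do not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Justification:
    antes: frozenset    # antecedent beliefs, antes(j)
    conseqs: frozenset  # consequence beliefs, conseqs(j)


@dataclass(frozen=True)
class Explanation:
    justifications: frozenset               # J
    explanandums: frozenset                 # M
    metaknowledge: frozenset = frozenset()  # B_m

    def beliefs(self):
        """B: metaknowledge plus every antecedent and consequence in J."""
        result = set(self.metaknowledge)
        for j in self.justifications:
            result |= j.antes | j.conseqs
        return frozenset(result)

    def assumptions(self, domain):
        """Beliefs in B that no justification in J concludes and that are
        outside the domain theory (metaknowledge excluded by assumption)."""
        justified = set()
        for j in self.justifications:
            justified |= j.conseqs
        return self.beliefs() - justified - self.metaknowledge - frozenset(domain)


def compete(x1, x2, m):
    """Two different explanations compete for m iff both explain m."""
    return x1 != x2 and m in (x1.explanandums & x2.explanandums)


# Hypothetical example beliefs, written as plain strings.
flow = "blood flows from heart to body"
j1 = Justification(frozenset({"heart contains blood", "heart contracts"}),
                   frozenset({flow}))
j2 = Justification(frozenset({"body pulls blood"}), frozenset({flow}))
x1 = Explanation(frozenset({j1}), frozenset({flow}))
x2 = Explanation(frozenset({j2}), frozenset({flow}))
domain = {"heart contains blood", flow}
```

Under these assumptions, `x1.assumptions(domain)` contains only "heart contracts", since that belief is neither concluded by a justification nor present in the domain theory, and `compete(x1, x2, flow)` holds because both explanations cover the same explanandum.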

Transcript  of  an  interview  about  the  seasons  from  Chapter  6 

Below  is  a  transcript  of  the  student  “Angela,”  courtesy  of  Sherin  et  al.  (2012).  We  have  removed 
symbols  that  indicate  gestures,  emphasis,  and  pauses,  but  we  have  kept  some  nonverbal 
annotations  where  helpful  for  understanding  the  conversation.  A  =  Student,  B  =  Interviewer. 


Who  Transcript 

1  B  I want to know why it's warmer in the summer and colder in the winter

2  A  That’s  because  like  the  sun  is  in  the  center  and  the  Earth  moves  around  the  sun 

and  the  Earth  is  at  one  point  like  in  the  winter  it’s  like  farther  away  from  the  sun- 

3  B  uh  huh- 

4  A  and  towards  the  summer  it's  closer  it’s  near  towards  the  sun. 

5  B  I  think  I  get  it.  Can  you  just  draw  a  picture  so  I'm  completely  sure? 

6  A  Okay.  The  sun's  in  the  middle  and  uh- 


7  B  Mmhm. Nice sun.

8  A  and then the uh the Earth kind of orbits around it

9  B  uh huh

10  A  And um like say at one it’s probably more of an ovally type thing -

11  B  Mmhm

12  A  In the winter, and uh er probably this will be winter ((moves pen tip to the
opposite side of the orbit and draws a new Earth)) since it's further away

13  B  Mmhm

14  A  See, that's, winter would be like, the Earth orbits around the sun. Like summer is
the closest to the sun. Spring is kind of a little further away, and then like fall is
further away than spring, but like not as far as winter

15  B  Mm hmm

16  A  and then winter is the furthest,

17  B  mm hmm

18  A  So the sun doesn’t, it like the flashlight and the bulb ((hand opening gesture over
the sun, as if her fingers were the sun’s rays spreading out)), it hits summer,

19  B  Mm hmm

20  A  the lines like fade in ((draws fading lines from sun to summer)), and get there
closer, like quicker

21  B  mm hmm

22  A  And by the time they get there [winter], they kinda fade and it's gets a lot colder
for winter

23  B  mm hmm

24  A  And  spring  it's  uh  kinda  ((gesturing  between  the  sun  and  the  earth  labeled 

spring ))  between  the  two  [between  winter  and  summer]  and  same  for  fall 

25  B  Mm  hmm.  Mm  hmm.  Um,  Is  this  something  -  have  you  done  this  already  for 

your  class  -  is  that  you  know  this  from? 

26  A  Uh,  kind  of,  like  from  first  and  second  grade  I  remember  the  time  that  the  Earth 

orbiting  and  whatnot. 

27  B  mm  hmm,  mm  hmm.  Okay.  So  that  makes  a  lot  of  sense.  Um.  One  thing  I 

wanted  to  ask  you  though  about  though  was  one  thing  that  you  might  have  heard 
is  that  at  the  same  time  -  and  you  can  tell  me  if  you've  heard  this  -  when  it's 
summer  here  ((B  taps  the  table  top)),  it's  actually  winter  in  Australia. 

28  A  mm  hmm 

29  B  Have  you  heard  that  before? 

30  A  Yeah. 

31  B  So I was wondering if your picture the way you drew it can explain that or if

that's  a  problem  for  your  picture. 

32  A  Uhhhh.  Idea.  I  need  another  picture. 

33  B  Okay.  So  is  that  a  problem  for  your  picture? 

34  A  Yeah,  that  is.  Um,  ok.  ((A  draws  in  a  new  sun,  with  smiley  face,  on  her  new  piece 

of  paper.))  There  is  like  the  sun.  And  okay.  Yeah.  ((A  drawing  a  new  elliptical 
orbit  around  the  sun.))  I  remember  that  now  cause,  um,  it's  like,  as  the  world  is 
rotating,  or  as  it's  orbiting 

35  B  Mm  hmm 

36  A  it's  rotating  too.  So  uh,  I  don’t  really  -  I  guess  I  don’t  really  understand  it.  Um. 

37  B  Well,  you’re  saying  as  the  Earth  is  going  around  here  ((B  sweeps  once  around  the 

orbit  A  has  drawn.))  it's  doing  what? 

38  A  It’s  like  spinning.  ((A  again  makes  the  quick  “rotating”  gesture  between  her 

thumb  and  first  finger  and  she  traces  out  the  drawn  orbit.))  Because  it's.  That’s 
how  it's  day  and  night  too. 

39  B  I see. It’s spinning like a top. ((B makes a “spinning” gesture above A’s

diagram.)) 

40  A  Yeah. 

41  B  Okay. 

42  A  So,  I  guess  I  really  don’t  understand  it,  that  much.  But.  Uum.  Yeah,  I  have  heard 

that [that when it is summer in Chicago, it is winter in Australia], cause I was
supposed  to  go  to  Australia  this  summer 

43  B  Uh  huh. 

44  A  but  it  was  going  to  be  winter 

45  B  Uh  huh. 

46  A  when I was going, but uh, their winters are really warm. So,

47  B  Mm  hmm.  So  you're  thinking  that  somehow  the  spinning,  you  thought  that 

somehow  if  you  take  into  account  the  fact  that  the  Earth  is  also  spinning,  that 
might  help  to  explain  why  it's  summer  and  winter  at  different  times 

48  A  Uh  -  yeah. 

49  B  That’s  what  you  were  thinking? 

50  A  Uh,  kinda.  Yeah. 

51  B  Just  to  be  clear,  what  was  -  What  was  the  problem  with  this  picture  for  the- 

52  A  Because,  yeah  I  rethought  that  and  it  looks  really  stupid  to  me  because  um 

summer  is  really  close  but  then  how  could  it  be  like  winter  on  the  other  side. 
Well. How could it be winter on the other side if it's really close here ((pointing
to summer earth)), and how could it be really warm if this ((pointing to winter
earth)) is this far away. I don’t know. That looks really dumb to me now. So.

53  B  It  doesn't  look  really  dumb  to  me.  A  lot  of  people  explain  it  this  way.  Um.  Okay, 

I'm  not  going  to  give  away  answers  yet.  You  can  find  this  out  -  you  can  find  this 
out  in  your  class. 

Rules  for  detecting  contradictions 

The  system  uses  the  following  pairs  of  statement  patterns  to  detect  contradictions.  We  do  not 
believe  this  list  is  complete  for  all  tasks,  but  it  is  complete  for  the  tasks  involved  in  the 
simulations  in  Chapters  5-8.  Each  symbol  beginning  with  a  question  mark  (?)  is  a  variable. 

Statement 1                 Statement 2

?x                          (not ?x)
(greaterThan ?x ?y)         (lessThanOrEqualTo ?x ?y)
(lessThan ?x ?y)            (greaterThanOrEqualTo ?x ?y)
(greaterThan ?x ?y)         (greaterThan ?y ?x)
(lessThan ?x ?y)            (lessThan ?y ?x)


Note  that  there  are  rules  for  inferring  lessThanOrEqualTo  from  lessThan  and  equalTo,  and 
likewise  for  greaterThanOrEqualTo.  Also,  contradictory  quantity  changes  are  covered  by 
the above ordinal relation pairs, since a quantity q's continuous change in value is represented as
an ordinal relation describing its derivative. For example, if q is increasing, we represent this as
(greaterThan (DerivativeFn q) 0). This means that if the system infers that a quantity
is  increasing  and  decreasing  in  the  same  time  interval  (e.g.,  in  the  seasons  simulation  in  Chapter 
6),  it  can  detect  the  contradiction  with  the  above  rules. 
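The pattern pairs and inference rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the system's actual rule engine: statements are encoded as nested tuples (a hypothetical encoding), and the names `contradicts`, `weaken`, and `find_contradictions` are introduced here only for the example. Only the one-step inference of the "or equal" relations from lessThan, greaterThan, and equalTo is modeled.

```python
from itertools import combinations

# Hypothetical encoding: a statement is a nested tuple such as
# ("greaterThan", ("DerivativeFn", "q"), 0) or ("not", ("touching", "a", "b")).

# (rel1, rel2, swap): (rel1 ?x ?y) contradicts (rel2 ?x ?y),
# or (rel2 ?y ?x) when swap is True.
ORDINAL_CONFLICTS = [
    ("greaterThan", "lessThanOrEqualTo", False),
    ("lessThan", "greaterThanOrEqualTo", False),
    ("greaterThan", "greaterThan", True),
    ("lessThan", "lessThan", True),
]

# Inference rules from the note: strict or equal implies "or equal".
WEAKENINGS = {
    "lessThan": ["lessThanOrEqualTo"],
    "greaterThan": ["greaterThanOrEqualTo"],
    "equalTo": ["lessThanOrEqualTo", "greaterThanOrEqualTo"],
}


def is_binary(stmt):
    """True for statements of the form (relation arg1 arg2)."""
    return isinstance(stmt, tuple) and len(stmt) == 3


def contradicts(s1, s2):
    """True if (s1, s2) matches a (Statement 1, Statement 2) pattern pair."""
    # First pattern: ?x versus (not ?x).
    if isinstance(s2, tuple) and len(s2) == 2 and s2[0] == "not" and s2[1] == s1:
        return True
    if not (is_binary(s1) and is_binary(s2)):
        return False
    (r1, x1, y1), (r2, x2, y2) = s1, s2
    for rel1, rel2, swap in ORDINAL_CONFLICTS:
        args2 = (y2, x2) if swap else (x2, y2)
        if r1 == rel1 and r2 == rel2 and (x1, y1) == args2:
            return True
    return False


def weaken(beliefs):
    """Add the 'or equal' statements inferable from each ordinal belief."""
    closed = set(beliefs)
    for stmt in beliefs:
        if is_binary(stmt):
            rel, x, y = stmt
            for weaker in WEAKENINGS.get(rel, []):
                closed.add((weaker, x, y))
    return closed


def find_contradictions(beliefs):
    """Return all unordered pairs of (inferred) beliefs that contradict."""
    closed = weaken(beliefs)
    return [(a, b) for a, b in combinations(closed, 2)
            if contradicts(a, b) or contradicts(b, a)]


# A quantity q inferred to be both increasing and decreasing.
increasing = ("greaterThan", ("DerivativeFn", "q"), 0)
decreasing = ("lessThan", ("DerivativeFn", "q"), 0)
conflicts = find_contradictions({increasing, decreasing})
```

Note that `increasing` and `decreasing` do not match any pattern pair directly. In this sketch the conflict surfaces only after the weakening step, where each strict relation contradicts the "or equal" statement inferred from the other.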

Sentences  from  a  textbook  passage  about  the  circulatory  system 

These  sentences  were  used  to  generate  the  instructional  knowledge  for  the  simulation  in  Chapter 
7.  Sentence  numbers  correspond  to  the  sentence  numbers  in  Chi  et  al.  (2001).  These  sentences 
comprise  the  “structure”  portion  of  the  passage  (Chi  et  al.,  1994a). 

1.  The septum divides the heart lengthwise into two sides.

2.  The  right  side  pumps  blood  to  the  lungs,  and  the  left  side  pumps  blood  to  the  other  parts 
of  the  body. 

3.  Each  side  of  the  heart  is  divided  into  an  upper  and  a  lower  chamber. 

4.  Each  lower  chamber  is  called  a  ventricle. 

5.  In  each  side  of  the  heart  blood  flows  from  the  atrium  to  the  ventricle. 
6.  One-way  valves  separate  these  chambers  and  prevent  blood  from  moving  in  the  wrong 
direction. 

7.  The  atrioventricular  valves  (a-v)  separate  the  atria  from  the  ventricles. 

8.  The  a-v  valve  on  the  right  side  is  the  tricuspid  valve,  and  the  a-v  valve  on  the  left  is  the 
bicuspid  valve. 

9.  Blood  also  flows  out  of  the  ventricles. 

10.  Two semilunar (s-l) valves separate the ventricles from the large vessels through which
blood  flows  out  of  the  heart. 

11.  Each of the valves consists of flaps of tissue that open as blood is pumped out of the
ventricles. 

12.  Blood  returning  to  the  heart,  which  has  a  high  concentration,  or  density,  of  carbon 
dioxide  and  a  low  concentration  of  oxygen,  enters  the  right  atrium. 

13.  The  atrium  pumps  it  through  the  tricuspid  valve  into  the  right  ventricle. 

14.  The  muscles  of  the  right  ventricle  contract  and  force  the  blood  through  the  right 
semilunar  valve  and  into  vessels  leading  to  the  lungs. 

15.  Each  upper  chamber  is  called  an  atrium. 

16.  In  the  lungs,  carbon  dioxide  leaves  the  circulating  blood  and  oxygen  enters  it. 

17.  The  oxygenated  blood  returns  to  the  left  atrium  of  the  heart. 

18.  The  oxygenated  blood  is  then  pumped  through  the  bicuspid  valve  into  the  left  ventricle. 

19.  Strong  contractions  of  the  muscles  of  the  left  ventricle  force  the  blood  through  the 
semilunar  valve,  into  a  large  blood  vessel,  and  then  throughout  the  body. 


